Around 1980, I took the GRE and scored very high on the math and verbal sections (both 90+ percentile) but very poorly – 28th percentile – on the new “Analytical Reasoning” section. That section was still “experimental” (though this was not disclosed to students), yet ETS issued scores for it anyway.
When ETS discovered that scores on this new section did not make sense, it kept that fact from students while quietly telling colleges to ignore the Analytical Reasoning scores.
Students, not knowing the scores were bogus and that colleges had been quietly told to ignore them, made life-altering decisions about whether to pursue graduate school based on ETS’s false scores.
The errors were many: ETS did not disclose to students that parts of the exam were “experimental,” it issued scores on those experimental sections, and it then told colleges, but not students, that the scores should be ignored. The problems were so egregious that Congress eventually held hearings about the screw-ups.
Lacking documentation of my own, I used AI search to see whether anything could be found. Sure enough, a good deal of information confirming ETS’s incompetence turned up.
First we look at the output from Copilot, followed by what Grok found, which is quite a bit more. Both searches confirmed the ETS failures and screw-ups.
I Asked Copilot About the GRE Exam Errors
Your recollection is impressively accurate and aligns with documented history surrounding the GRE’s Analytical Reasoning section in the early 1980s.
🧠 Background: Introduction of Analytical Reasoning
- The Analytical Reasoning section was introduced to the GRE General Test around 1981, as part of a broader effort to measure problem-solving and logical thinking skills beyond verbal and quantitative reasoning.
- This section included logic puzzles, pattern recognition, and deductive reasoning tasks—similar to LSAT-style questions.
⚠️ Initial Problems and Quiet Retraction
- According to a 1982 ETS research report, the early implementation of the Analytical Reasoning section faced validity and reliability issues. The scoring and predictive value were not consistent across test-takers.
- As a result, ETS advised graduate programs to disregard scores from that section during admissions evaluations.
- Crucially, this guidance was communicated to institutions but not to students directly. Instead, it appeared as a footnote in GRE information booklets in subsequent years—effectively buried from public view.
📚 Why It Matters
- This episode is a rare example of asymmetric disclosure in standardized testing—where institutions received actionable information, but test-takers were left unaware.
- It also reflects the evolving nature of psychometric testing in the 1980s, as ETS and other testing bodies grappled with how best to measure cognitive skills beyond rote knowledge.
Grok AI’s Answer Goes into More Detail
History of the GRE Analytical Section
The Graduate Record Examination (GRE) was first administered in 1937, initially focusing on verbal and quantitative abilities. The analytical section, introduced in the late 1970s to early 1980s, aimed to assess logical and analytical reasoning skills separate from verbal or quantitative aptitude. Specifically, the Analytical Ability measure (often referred to as “analytical reasoning” in early descriptions) was added to the GRE General Test around 1977–1980, with full implementation by the early 1980s. This section consisted of multiple-choice questions involving logical puzzles, arguments, and data interpretation, designed to evaluate critical thinking.
Your recollection aligns closely with the timeline: the section debuted approximately in 1980 (sources vary slightly between 1977 and 1981 due to pilot testing phases). It was intended to provide graduate admissions committees with a more holistic view of applicants’ reasoning abilities, but it quickly faced significant challenges.
Problems with the Analytical Section and ETS’s Response
Shortly after its introduction, the Analytical Ability section encountered development issues. According to Educational Testing Service (ETS) reports and historical analyses, the items (particularly analytical reasoning questions) exhibited problems such as:
- Low reliability and validity: Statistical analyses showed weak predictive power for graduate school success. For instance, a 1980s ETS study (GRE Board Professional Report No. 84-14P) examined over 1,400 analytical reasoning items and found that while some content characteristics (e.g., complexity of logical structures) predicted item difficulty, overall discrimination and independence from verbal/quantitative measures were inconsistent. Correlations with graduate performance were as low as 0.11, far below those for other sections (0.35–0.50); the short sketch after this list shows what that gap implies.
- Item development inefficiencies: Pretesting revealed that many items required excessive revisions due to ambiguity, cultural biases, or failure to measure intended skills uniformly across diverse test-takers. This led to higher costs and delays in creating operational test forms.
- Criticism from experts: Psychometricians like Robert Sternberg (then at Yale, now at Cornell) critiqued the section in the 1980s for poor alignment with real-world analytical skills, noting it weakly predicted success in fields like psychology.
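To put those validity coefficients in perspective, here is a minimal Python sketch. It is illustrative only: the r values are the ones cited above, and squaring a correlation to get “variance explained” is standard statistics, not a reconstruction of ETS’s actual analysis.

```python
# Illustrative only: what the cited validity coefficients imply.
# r is the correlation between a GRE section score and graduate
# performance; r**2 is the share of variance in performance that
# the section score accounts for.
sections = [
    ("Analytical", 0.11),                 # cited low end for the new section
    ("Verbal/Quantitative (low)", 0.35),
    ("Verbal/Quantitative (high)", 0.50),
]

for name, r in sections:
    print(f"{name}: r = {r:.2f} -> variance explained = {r**2:.1%}")

# Prints, approximately:
#   Analytical: r = 0.11 -> variance explained = 1.2%
#   Verbal/Quantitative (low): r = 0.35 -> variance explained = 12.2%
#   Verbal/Quantitative (high): r = 0.50 -> variance explained = 25.0%
```

In other words, a coefficient of 0.11 accounts for roughly one percent of the variation in graduate outcomes, an order of magnitude less than the other sections.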
In response, ETS advised graduate institutions to disregard Analytical scores starting in the early 1980s (around 1981–1982). This guidance was issued through official communications to admissions offices, emphasizing that the scores should not factor into evaluations due to reliability concerns. Institutions were informed via bulletins, memos, and score interpretation guides, allowing them to focus on verbal and quantitative results instead.
Disclosure Practices: To Colleges vs. Students
The key controversy you mentioned—differential disclosure—appears to stem from ETS’s communication strategy during this transitional period:
- To colleges/admissions offices: Full disclosure was provided promptly. ETS’s “Guide to the Use of Scores” and interpretive data reports (e.g., from the early 1980s) explicitly recommended ignoring Analytical scores. This was to maintain fairness in admissions, as the section’s flaws could disadvantage applicants unfairly.
- To students: Information was not directly communicated to test-takers at the time of testing or score reporting. Students received their scores without caveats, potentially leading them to believe the Analytical section was fully valid and comparable to other measures. This lack of transparency was criticized in later reviews, as it could have influenced students’ perceptions of their overall performance or decisions to retake the test.
Instead, the advisory appeared in a footnote in subsequent editions of the GRE Information Bulletin (the official guide for test-takers, published annually by ETS). These footnotes, starting around 1982–1983, noted something to the effect of: “Due to ongoing evaluation, institutions are advised not to use Analytical scores in admissions decisions.” The Bulletin was distributed to prospective test-takers, but the footnote was easy to overlook, buried in fine print amid detailed test policies. This indirect method meant many students from the early 1980s tests were unaware until later, if at all.
This practice raised ethical questions about equity, as it created an information asymmetry. ETS justified it as a way to avoid alarming students mid-transition while protecting institutional decision-making, but it contributed to broader criticisms of standardized testing transparency in the 1980s (e.g., in congressional hearings on test security and fairness).
Evolution and Replacement of the Section
The Analytical Ability measure persisted with caveats until major revisions:
- In 1994–1995, ETS attempted reforms but faced ongoing issues, including cheating vulnerabilities noted in Kaplan’s testimony to New York legislators.
- By October 2002, it was fully replaced by the Analytical Writing measure, shifting from multiple-choice reasoning to essay-based tasks (Analyze an Issue and Analyze an Argument). This change addressed the reliability concerns by providing qualitative writing samples, though the writing measure has its own debates (e.g., on scoring consistency).
Today, the GRE no longer includes a separate analytical reasoning component; the Analytical Writing measure focuses on writing skills, is scored 0–6 in half-point increments, and is considered more valid for graduate admissions.
Sources and Further Reading
This information draws from ETS’s historical reports (e.g., GRE Board Professional Reports from the 1980s), Wikipedia’s overview of GRE evolution (citing ETS data), and analyses in test-prep resources like Manhattan Review’s GRE history summary. For primary sources, archived GRE Information Bulletins from the 1980s (available via Internet Archive or ETS’s research compendium) confirm the footnote practice.
SAT Exam Scores
The SAT was introduced in the early 20th century as a way to screen out racial minorities.
Today, it seems to screen out less-wealthy individuals.

The SAT has origins rooted in racial exclusion. It was developed in the 1920s by Carl Brigham, a eugenicist, and was initially used to reinforce racial hierarchies and screen out minorities from elite institutions.
🧠 Historical Origins of the SAT
- The SAT was introduced in 1926 by Carl Brigham, a Princeton psychologist who had worked on Army IQ tests during World War I.
- Brigham was a proponent of eugenics, and in his 1923 book A Study of American Intelligence, he argued that intelligence testing proved the superiority of the “Nordic race” and warned against racial mixing.
- His work influenced the SAT’s design, which was intended to identify and elevate white Anglo-Saxon Protestant students while filtering out immigrants and racial minorities.
🧪 Mechanisms of Bias
- Question selection favored white students: Items that Black students answered correctly more often than white students were discarded during test development.
- The SAT was used by elite universities like Harvard to maintain racial and class exclusivity, especially during periods of rising immigration.
- Over time, the test became a gatekeeping tool that systematically disadvantaged students of color, particularly Black, Latinx, and Native American students.
📉 Ongoing Disparities
- Despite reforms, racial score gaps persist: in 2019, only 9% of Black students scored 1200 or higher, compared to 45% of white students.
- Critics argue that standardized tests still reflect structural inequalities—from school funding to access to test prep—not innate ability.
🧭 Modern Reassessment
- Many colleges have moved to test-optional policies, especially post-2020, citing the SAT’s discriminatory legacy.
- The College Board has attempted reforms, including an “adversity score,” but critics say these changes don’t fully address the test’s foundational bias.