Category Archives: Software Quality

Public Health England lost about 16,000 Covid-19 positive tests due to not understanding Excel spreadsheet limitations

The BBC has confirmed the missing Covid-19 test data was caused by the ill-thought-out use of Microsoft’s Excel software. Furthermore, PHE was to blame, rather than a third-party contractor.

Source: Covid: Test error ‘should never have happened’ – Hancock – BBC News

Could have happened to anyone. What’s 16,000 lost positive test results among friends, anyway?
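
The widely reported technical cause was that CSV lab feeds were pulled into the legacy .xls file format, which caps a worksheet at 65,536 rows – everything past the cap was silently dropped. Here is a minimal Python sketch of that failure mode (the file name and loader are hypothetical, not PHE’s actual pipeline):

```python
# Minimal sketch (hypothetical file and loader, not PHE's actual pipeline):
# how a silent row cap loses records. The legacy .xls format holds at most
# 65,536 rows per worksheet; anything imported beyond that is simply dropped.
import csv

XLS_ROW_LIMIT = 65_536   # legacy .xls limit; .xlsx allows 1,048,576 rows

def load_with_row_cap(path: str, limit: int = XLS_ROW_LIMIT) -> list[dict]:
    """Simulate an import that silently stops at the worksheet row limit."""
    rows = []
    with open(path, newline="") as f:
        for i, record in enumerate(csv.DictReader(f)):
            if i >= limit - 1:   # one row is taken by the header
                break            # silent truncation: no error, no warning
            rows.append(record)
    return rows

# results = load_with_row_cap("daily_lab_feed.csv")
# Positive tests past the cap never reach the reporting system at all.
```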

Public health says it has no data on what works and what does not work

Oregon has released its periodic modeling update report.

Shockingly, they say they have no data on what measures work or do not work, or whether or not anyone is adhering to them. They have no information on whether some measures work better or worse than others. They have no data on any measures at all.

Continue reading Public health says it has no data on what works and what does not work

It’s worse than we thought: “Second Analysis of Ferguson’s Model”

I have commented on Neil Ferguson’s disease model in the past and have repeatedly noted its poor quality. This model was used, last spring, as the basis for setting government policies to respond to Covid-19. Like many disease models, its output was garbage, unfit for any purpose.

The item below notes that the revision history since last spring is available, and that it shows ICL has not been truthful about the changes made to the original model code.

Source: Second Analysis of Ferguson’s Model – Lockdown Sceptics

THIS! Many academic models, including disease models and climate models, average the outputs from multiple runs, somehow imagining that this produces a reliable projection – uh, no, it does not work that way.

An average of wrong is wrong. There appears to be a seriously concerning issue with how British universities are teaching programming to scientists. Some of them seem to think hardware-triggered variations don’t matter if you average the outputs (they apparently call this an “ensemble model”).

Averaging samples to eliminate random noise works only if the noise is actually random. The mishmash of iteratively accumulated floating point uncertainty, uninitialised reads, broken shuffles, broken random number generators and other issues in this model may yield unexpected output changes but they are not truly random deviations, so they can’t just be averaged out.
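
Here is a minimal sketch of that point, with invented numbers: averaging many runs does cancel zero-mean random noise, but if every run carries the same systematic error (a shared bug, a broken RNG), the ensemble mean simply reproduces the error.

```python
# Minimal sketch, invented numbers: averaging cancels zero-mean random noise,
# but an error shared by every run is untouched by averaging.
import random

random.seed(1)
true_value = 100.0

# Case 1: runs differ only by unbiased random noise -> the mean converges on truth.
noisy_runs = [true_value + random.gauss(0, 5) for _ in range(1000)]
print(round(sum(noisy_runs) / len(noisy_runs), 1))    # ~100.0

# Case 2: every run shares the same systematic error (think: a common bug),
# plus some noise. The "ensemble mean" is still wrong by the full bias.
biased_runs = [true_value + 20.0 + random.gauss(0, 5) for _ in range(1000)]
print(round(sum(biased_runs) / len(biased_runs), 1))  # ~120.0
```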

Software quality assurance is often missing in academic projects that are used for public policy:

For standards to improve academics must lose the mentality that the rules don’t apply to them. In a formal petition to ICL to retract papers based on the model you can see comments “explaining” that scientists don’t need to unit test their code, that criticising them will just cause them to avoid peer review in future, and other entirely unacceptable positions. Eventually a modeller from the private sector gives them a reality check. In particular academics shouldn’t have to be convinced to open their code to scrutiny; it should be a mandatory part of grant funding.

The deeper question here is whether Imperial College administrators have any institutional awareness of how out of control this department has become, and whether they care. If not, why not? Does the title “Professor at Imperial” mean anything at all, or is the respect it currently garners just groupthink?

When a software model – such as a disease model – is used to set public policies that impact people’s lives – literally life or death – it should adhere to the standards used for life-safety-critical software systems. Such standards exist for, say, medical equipment, nuclear power plant monitoring systems, and avionics – because defects there put people’s lives at risk. A disease model has similar effects – and hacked-together models that adhere to no standards have no business being used to establish life-safety-critical policies!

Another software engineer and I had an interaction with Gavin Schmidt of NASA regarding software quality assurance of their climate model or paleoclimate histories[1]. He noted they only had funding for one quarter of a full-time-equivalent person to work on SQA – in other words, they had no SQA. Instead, their position was that the model’s output should be compared to the output of other models. That would be like Microsoft judging the quality of MS Word not by testing it, but by comparing its output to the output of another word processor – a sort of quality-by-proxy argument. Needless to say, this is not how SQA works.
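
For contrast, here is a hedged sketch of what even minimal SQA looks like: pin a component’s behaviour to independently known answers with unit/regression tests, rather than comparing one unverified program’s output to another’s. The function and values below are hypothetical, not taken from any NASA or ICL code.

```python
# Hypothetical sketch of basic SQA: a unit/regression test pins behaviour to a
# known, hand-checked answer instead of comparing against another unverified program.
import math
import unittest

def doubling_time(growth_rate_per_day: float) -> float:
    """Days for a quantity growing at the given daily rate to double."""
    return math.log(2) / math.log(1 + growth_rate_per_day)

class TestDoublingTime(unittest.TestCase):
    def test_known_value(self):
        # 10% daily growth doubles in about 7.27 days (hand-checked reference).
        self.assertAlmostEqual(doubling_time(0.10), 7.27, places=2)

    def test_monotonic(self):
        # Faster growth must always mean a shorter doubling time.
        self.assertLess(doubling_time(0.20), doubling_time(0.10))

if __name__ == "__main__":
    unittest.main()
```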

Similarly, the climate model community always averages multiple runs from multiple models to create projections. They do this even when some of the model projections are clearly off the rails. Averaging many wrongs does not make a right.

[1] Note that NASA does open-source their software, which enables more eyes to see the code, and I do not mean to pick on NASA or Schmidt here. They are doing what they can within their funding limitations. The point, however, is that SQA is frequently given short shrift in academic-like settings.

What if you could be convicted with secret evidence you cannot see nor contest?

All defendants have a right to review the evidence against them. When a software application produces a conclusion used as evidence, the software’s source code must be reviewable by the defense.

The government argues it can use secret software against a defendant – software that may very well be defective (think of Neil Ferguson’s secret disease-modeling code at Imperial College London, which ignores all modern software engineering practices).

Can secret software be used to generate key evidence against a criminal defendant?

Source: EFF and ACLU Tell Federal Court that Forensic Software Source Code Must Be Disclosed | Electronic Frontier Foundation

Twitter still a mess in the aftermath of the takeover of their systems

Read the whole thing – Twitter blog post update.

It’s pretty clear they still do not have a full handle on the situation.

Twitter acknowledges that the hackers downloaded the Twitter Data for some accounts, which may include private Direct Messages.

I no longer regard Twitter as safe. I deactivated 2 of my 4 accounts, and had already deleted all content of my main account – except DMs. I’m in the process of clearing out all the DMs now. I intend to keep one or two of the accounts alive but will probably no longer use them.

This incident was a total failure of Twitter’s security and of its ability to be trusted with holding user information. At this time, no one should have any trust in Twitter – and I mean no one. Clear your data as soon as possible. What just happened could have created one or more international incidents as hackers seized control of prominent political accounts.

Japan pulls its coronavirus tracking smartphone app due to software design errors

The Japanese government has pledged to fix within a week bugs that have caused its coronavirus contact-tracing smartphone app to be shut down, the health minister said Tuesday.

The free app, which was launched Friday and downloaded around 3.71 million times as of Tuesday morning, erroneously accepts ID numbers not issued by the Health, Labor and Welfare Ministry, Katsunobu Kato, the minister responsible for the system, said at a press conference.

Source: Bugs force Japan gov’t to temporarily shut down virus contact-tracing app
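
Per the report, the defect was accepting processing numbers that the ministry never issued. Here is a hypothetical sketch of the difference between a format-only check and a check against a registry of issued numbers (all names and values invented, not the app’s actual code):

```python
# Hypothetical sketch: accepting any well-formed number vs. verifying it against
# the set of numbers actually issued. All names and values are invented.
ISSUED_NUMBERS = {"12345678", "87654321"}   # stand-in for the ministry's registry

def format_only_check(processing_number: str) -> bool:
    # What a buggy check can amount to: "looks like an 8-digit number, accept it."
    return processing_number.isdigit() and len(processing_number) == 8

def registry_check(processing_number: str) -> bool:
    # The check that matters: was this number really issued?
    return format_only_check(processing_number) and processing_number in ISSUED_NUMBERS

print(format_only_check("00000000"))  # True  -> bogus report accepted
print(registry_check("00000000"))     # False -> bogus report rejected
```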

Experts criticize Ferguson’s ICL Covid sim model as garbage

Those of us who have seen Neil Ferguson’s ICL Covid sim model have the same views as this computational epidemiologist:

As Ferguson himself admits, the code was written 13 years ago, to model an influenza pandemic. This raises multiple questions: other than Ferguson’s reputation, what did the British government have at its disposal to assess the model and its implementation? How was the model validated, and what safeguards were implemented to ensure that it was correctly applied? The recent release of an improved version of the source code does not paint a favorable picture. The code is a tangled mess of undocumented steps, with no discernible overall structure. Even experienced developers would have to make a serious effort to understand it.

I’m a virologist, and modelling complex processes is part of my day-to-day work. It’s not uncommon to see long and complex code for predicting the movement of an infection in a population, but tools exist to structure and document code properly. The Imperial College effort suggests an incumbency effect: with their outstanding reputations, the college and Ferguson possessed an authority based solely on their own authority. The code on which they based their predictions would not pass a cursory review by a Ph.D. committee in computational epidemiology.

Source: Britain’s Hard Lesson About Blind Trust in Scientific Authorities

Continue reading Experts criticize Ferguson’s ICL Covid sim model as garbage

ICL Covid-simulation source code

I will not comment on Covid-19 itself; I will only make a few comments on the publicly available source code.

  • This is not a comment about whether models should be used – or not.
  • This is not a comment about whether this model’s output is correct – or not (we have no way of knowing either way). Even if the model’s output is off by very large amounts, we still have no way of knowing.
  • This is not a comment on whether there should be a lock down – or not.
  • This is not a comment on whether a lock down is effective – or not.
  • This is a review of a software project.
  • The review findings are typical of what is often seen in academic software projects and other “solo contributor” projects (versus modern “production code” projects). The issues that often arise in academic projects come from individuals or small groups, not trained in software engineering, tinkering with code until it grows out of control. This likely occurs in other fields too, but seldom do such works become major components of public policy.
  • When software is used for public policy it needs to be publicly reviewed by independent parties. Until the past month, this code had apparently not been reviewed outside of the ICL team.
  • Models are a valuable tool, when properly used and their limitations are understood. A reasonable model enables planners to play “what if” scenarios: adjust input parameters and see what might occur. For example, consider a model of complex manufacturing – we might look at productivity measures, inventory, defect rates, costs of defect repairs, costs of dissatisfied customers, impacts on profits and revenue, supplier issues and so on. If we choose to optimize for profit, then we use our model to find the values for each parameter that achieve maximum profit. Or perhaps we optimize for customer satisfaction instead – what happens to our profits if we do that? That is a what-if question. For this purpose the model need not be perfect, but it at least needs to be “within the ball park”. The key is “within the ball park”: if the model flies off the rails in many cases, it is not a good and accurate model and there is a risk we make seriously wrong decisions.
  • A model may also be used to compare scenarios. We may not need precise future projections for that – if we, say, increase X, our model shows high profits, but in another run, we decrease X and it shows losses. We may not need to know the exact dollar value – only that one path leads to profit and the other leads to losses. In this way, precise projections are not always essential (see the sketch after this list).
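
Here is a toy sketch of the last two bullets, with entirely invented numbers: a rough manufacturing model is run under two scenarios, and the useful answer is the sign of the result (profit vs. loss), not the exact dollar figure.

```python
# Toy what-if sketch (all numbers invented): a crude profit model run under two scenarios.
def toy_profit(units: int, defect_rate: float, price: float = 50.0,
               unit_cost: float = 30.0, repair_cost: float = 200.0) -> float:
    """Profit = revenue minus production costs minus defect-repair costs."""
    revenue = units * price
    production = units * unit_cost
    repairs = units * defect_rate * repair_cost
    return revenue - production - repairs

# Scenario A: push volume and tolerate more defects.
# Scenario B: slow down and hold defects near zero.
scenario_a = toy_profit(units=10_000, defect_rate=0.12)
scenario_b = toy_profit(units=8_000, defect_rate=0.02)
print(f"A: {scenario_a:+,.0f}  B: {scenario_b:+,.0f}")
# Output: A: -40,000  B: +128,000 -- the figures are rough, but the sign of each
# result is enough to tell which path to prefer.
```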

This code – placed on GitHub – is apparently a revision released by the University of Edinburgh, based on the original source code by Neil Ferguson of Imperial College London. They are said to have asked Microsoft and others to help clean it up and fix defects. Consequently, this is not the exact code that Neil Ferguson was using to create his models two months ago, but code that has since been updated by others.

This code is thousands of lines of very old C programming.

The first thing I noticed was how much code has been placed in one gigantic source file – 5,400 lines in a single file. Ouch.

This explains much:

The “undocumented C” problem arises when the author is the only one working on a project and sees no need to document their work. After all, it’s just me! There are two problems with this thinking: (1) over time, even personal projects like this one grow until they are thousands of lines of code. Years later, our understanding of our own original code is not as good as we think it is. We forget why we made particular design choices. We forget why we assumed certain conditions or values. Bottom line: over time, we forget. And (2) personal projects like the Covid-19 simulation can eventually become the basis of major public policy, and others are asked to review, check or modify the code base. No documentation puts the entire model at risk. This is not the right way to do these kinds of software projects, particularly when they become the basis for advising world leaders on major public policies that impact billions of people.
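
As a hypothetical illustration of the point (not code from the ICL model), here is the same small calculation twice – once in the “just me” style, once with the names, units and assumptions recorded so that a reviewer years later can follow it:

```python
# Hypothetical illustration -- not ICL code.

# (a) The "just me" style: works today, unreadable in five years.
def f(n, r, d):
    return n * (1 - (1 - r) ** d)

# (b) The same logic with names, units and assumptions documented.
def expected_infections(population: int, daily_attack_rate: float, days: int) -> float:
    """Expected cumulative infections over `days`, assuming each person independently
    faces a constant daily probability of infection (a deliberate simplification)."""
    p_never_infected = (1 - daily_attack_rate) ** days
    return population * (1 - p_never_infected)
```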

Continue reading ICL Covid-simulation source code

Software: Why hiring professional software engineers might have been a good idea #IowaCaucus

Oh my:

It wasn’t so much that the new app that the Iowa Democratic Party had planned to use to report its caucus results didn’t work. It was that people were struggling to even log in or download it in the first place. After all, there had never been any app-specific training for his many precinct chairs.

No training? This points to a lack of common sense and systems analysis at the start of the project. How was this missed?

Further, they likely had not created use cases, which would have caught the next set of failures.

So last Thursday Mr. Bagniewski, the chairman of the Democratic Party in Polk County, Iowa’s most populous, decided to scrap the app entirely, instructing his precinct chairs to simply call in the caucus results as they had always done.

The only problem was, when the time came during Monday’s caucuses, those precinct chairs could not connect with party leaders via phone. Mr. Bagniewski instructed his executive director to take pictures of the results with her smartphone and drive over to the Iowa Democratic Party headquarters to deliver them in person. She was turned away without explanation, he said.

Source: ‘A Systemwide Disaster’: How the Iowa Caucuses Melted Down – DNyuz

I live in the state that featured Cover Oregon[1], a $450 million health exchange that never enrolled a single individual subscriber. It was a complete failure. Healthcare.gov received most of the media attention concerning large failed government software projects, but several state projects also failed.

Both health exchange fiascos – and the Iowa Caucus disaster – point to over-reliance on software and an assumption that more tech is always better. Tech can make things better, but only when qualified people are involved in all aspects of the project.

Update – my guess was correct, says the NY Times:

Shadow was also handicapped by its own lack of coding know-how, according to people familiar with the company. Few of its employees had worked on major tech projects, and many of its engineers were relatively inexperienced.

Only 25% of precinct chairs were able to successfully install the app. A colossal failure. The system relied, in part, on “security by obscurity”, which never works.

Update: “They” have quite a history with failed software development. The Associated Press said it could not name a winner of the Iowa Democratic Party Caucuses.

Continue reading Software: Why hiring professional software engineers might have been a good idea #IowaCaucus

A call for a code of tech ethics?

Facebook and the like need to craft a professional code of ethics for the technology industry.

Source: A Facebook request: Write a code of tech ethics – Los Angeles Times

Where this is headed, naturally, is the concept of licensed professional engineers (P.E.) in software engineering. A professional engineering licensing exam for software engineering was developed many years ago. I believe Texas was the only state to offer it; however, due to low participation, the software engineering PE exam is being discontinued as of April 2019.