Is there any science in computer science or software engineering?

Revision 15
© 2013-2019 by Zack Smith. All rights reserved.

Question assumptions

When large numbers of people collectively assume something to be true, these beliefs often go unquestioned. But it is logically fallacious to conclude that something is true just because many people believe it. This fallacy is known as argumentum ad populum. In fact, the wisdom of the crowd is something of a misnomer; a crowd's collective IQ is often limited by its loudest or most persuasive members, who may hold quite nonsensical and unfounded beliefs.

There has been a great deal of hype in the media and academia praising high technology, perhaps because both are profit-focused institutions and the Tech Sector has lately produced great wealth and status, albeit at the expense of privacy, free speech, and social order. But the hype shouldn't sway us into believing unfounded assumptions.

One of those assumptions is that Computer Science is a science, which lends a dignifying shine and an air of authority to the subject. But is it a science? Does it deserve that name and the type of respect that comes with it?

Is computer science a science?

No, for the most part, it is not. In order for a discipline to be a science in any genuine or meaningful sense, it has to employ the scientific method. This is what qualifies something as a hard science. A soft science, by comparison, is one where, as in anthropology or sociology, controlled experimentation is rarely possible, so observation and analysis are the focus.

Computer scientists largely do not use the scientific method, and surely some could not even describe it. Nor is computer science primarily about observation. It is therefore not a science in either the hard or the soft sense. A better name would be computing studies or applied math.

Computer Science as it is practiced is probably about as scientific as

  • Economics
  • Christian Science
  • or political science.

And even though software does much to enable scientists to do new kinds of experiments, that does not make computer science a science.

Stanford lecturer Andy Freeman once joked that computer scientists firmly believe in the one true way, which varies based on beliefs and circumstances. Though he was joking, there is truth in it. Any student of science knows that such thinking is the opposite of scientific reasoning.

To extend the joke, I will leave it as a homework assignment for the reader to identify some ways in which computer science as practiced is not only not scientific, but is religion-like.

  • For instance, have you ever heard someone refer to the beauty of object-oriented programming and show bitter disdain for dissenters?
  • Does the always-increasing complexity of a language, its core libraries, and an operating system automatically create a specialist priesthood that seeks to repel reformers and newbies?

What is the Scientific Method?

It is a brave approach to learning the truth: one that embraces uncertainty as a challenge rather than a threat, relies on evidence rather than wishes or dogma, questions its own assumptions and conclusions, and rejects unprovable claims.

What are the steps of the Scientific Method?

  1. Observe. Formulate a useful question.
  2. Hypothesize.
  3. Predict. If the hypothesis were true, what would result?
  4. Experiment to test the hypothesis.
  5. Analyze. Form a conclusion.

If a hypothesis is proven false, then think hard, question yourself and your methods, devise a new hypothesis, and run a new experiment to test it. Rinse and repeat.
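
As a programmer's sketch, this revise-and-retest loop might look like the following Python. Every name here is an illustrative placeholder, not a real API.

```python
# Illustrative sketch only: the scientific method as an iterative loop.
# All function names and the toy "hidden rule" below are hypothetical.

def scientific_method(observe, hypothesize, predict, run_experiment, supports):
    """Revise the hypothesis until an experiment supports it."""
    hypothesis = hypothesize(observe())
    while True:
        prediction = predict(hypothesis)
        result = run_experiment(prediction)
        if supports(result, prediction):
            return hypothesis              # provisionally accepted, never "proven"
        hypothesis = hypothesize(result)   # question, revise, repeat

# Toy run: guess the rule behind a hidden function f(x) = 2x + 1.
hidden = lambda x: 2 * x + 1
candidates = iter([lambda x: 2 * x, lambda x: 2 * x + 1])

winner = scientific_method(
    observe=lambda: "what rule maps x to f(x)?",
    hypothesize=lambda _: next(candidates),          # propose the next candidate rule
    predict=lambda hyp: [(x, hyp(x)) for x in range(5)],
    run_experiment=lambda pred: [(x, hidden(x)) for x, _ in pred],
    supports=lambda result, prediction: result == prediction,
)
```

Here the first candidate rule fails the experiment and is discarded; the second survives and is returned, still only provisionally.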

Searching for the science in computer science

While CS as a whole is not a science, some bits of it may use the Scientific Method. Other bits may use it informally. In this way it is a bit like the study of psychology.

Examples of science within CS may include

  • penetration testing
  • malware analysis
  • the development of speech recognition
  • the development of genetic algorithms.

These required, and still require, experimentation, potentially involving clear and explicit hypotheses and controlled experiments. Did researchers follow a rigorous scientific approach? Only they know. Are these not more like engineering endeavors than pure research? Perhaps.

A list of computing-related experiments

This is a section where I hope to list a great many found examples of scientific experiments related to computing, as they become known to me and as time permits.

  • An experiment to assess what job ads are shown to women.
    • Automated Experiments on Ad Privacy Settings by Datta, Tschantz and Datta.
    • Women less likely to be shown ads for high-paid jobs on Google, The Guardian July 2015.
  • Analysis of Stuxnet malware.

Does software engineering employ the scientific method?

There are a handful of situations in which software engineers do something akin to the Scientific Method.

Software engineering is sometimes described as an offshoot of psychology, because it is as much about solving people problems as about solving technological problems. For instance, the Cambridge Analytica scandal resulted from what were, in someone's eyes, technical successes, achieved without much thought for ethics.

Experimentation on opaqueware

Some inherited source code is so convoluted and hard to understand that you might as well call it opaqueware rather than the usual software debt. In the face of absurd complexity, over-engineered code, and/or bad or missing documentation, a programmer may have to resort to something rather like the Scientific Method just to make sense of the code.

Observation: Input X leads to output Y.
Hypothesize: There is an internal set of rules Z that causes this behavior.
Prediction: Given those rules, other inputs should produce specific outputs.
Experiment: Test the hypothesis using a variety of inputs that conform to or break the rule(s).
Analysis: The hypothesis is confirmed or refuted.
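
The steps above can be sketched in code. Here legacy_transform is a hypothetical stand-in for inherited code too convoluted to read directly.

```python
# Sketch: probing "opaqueware" as a black box, following the steps above.
# legacy_transform stands in for thousands of lines of inherited code.

def legacy_transform(x):
    # Pretend this body is unreadable; we only see inputs and outputs.
    return (x * 3 + 1) if x % 2 else x // 2

# Observation: input 4 yields 2, input 5 yields 16.
# Hypothesis (rule set Z): odd inputs map to 3x+1, even inputs to x/2.
def hypothesized_rule(x):
    return (x * 3 + 1) if x % 2 else x // 2

# Prediction + experiment: the rule should match across varied inputs,
# including edge cases the original observations never covered.
trial_inputs = list(range(100)) + [-3, -8, 10**6]
mismatches = [x for x in trial_inputs
              if legacy_transform(x) != hypothesized_rule(x)]

# Analysis: an empty mismatch list supports (but never proves) the hypothesis.
print("hypothesis holds on all trials" if not mismatches else mismatches)
```

A single mismatch would refute the hypothesized rule set and send the programmer back to step two.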

Algorithm design

The act of creating a novel algorithm is sometimes done in such a way that the design process involves a succession of micro-experiments.

The engineer first contemplates the problem, and then asks: If I write the code this way, will it work? This implies the hypothesis: Doing it this way will produce the correct performance or behavior.

Then the engineer runs experimental code that implements the algorithm on some input to test whether it works. Or he walks through the code mentally and writes down the predicted output.

If the code does not work (the hypothesis that it will work when written a certain way was false) he must question his assumptions, then tweak the code or rewrite it and run the new micro-experiment.

However, these micro-experiments are more like crafting furniture than doing proper science.

Observation: The algorithm is not working right. The output is wrong or the running time is excessive.
Hypothesis: It is missing a key feature that I can add.
Prediction: If I change this like so, all the tests will run correctly.
Experiment: Run the code, measure results.
Analysis: It worked or didn't work.
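
One common form of this micro-experiment is checking a fast candidate algorithm against a slow but trusted reference. The example below is a generic illustration, not taken from the text: it tests a single-pass maximum-subarray routine (Kadane's algorithm) against brute force.

```python
# Sketch: a micro-experiment for algorithm design.
# Hypothesis: the single-pass routine matches a brute-force reference.

import random

def fast_max_subarray(a):
    """Kadane's algorithm: best sum of any contiguous non-empty slice."""
    best = cur = a[0]
    for x in a[1:]:
        cur = max(x, cur + x)       # extend the run or start fresh
        best = max(best, cur)
    return best

def brute_max_subarray(a):
    """Trusted but slow reference: try every slice."""
    return max(sum(a[i:j]) for i in range(len(a))
                            for j in range(i + 1, len(a) + 1))

# Experiment: random inputs. Analysis: any mismatch refutes the hypothesis.
random.seed(0)
for _ in range(200):
    a = [random.randint(-9, 9) for _ in range(random.randint(1, 12))]
    assert fast_max_subarray(a) == brute_max_subarray(a)
print("hypothesis survived 200 trials")
```

As with furniture-making, a passed trial only means the piece holds up so far; it is confidence-building, not proof.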


While some debugging requires analysis rather than experimentation, when an engineer faces a mystifying bug it is sometimes better to treat the code like a black box and then make and test hypotheses. Don't assume you can guess the problem; assuming may actually delay you longer than not assuming.

I am wise because I know I know nothing.
   - Socrates

When you ASSUME, you make an ASS out of U and ME!
   - Benny Hill

For instance, suppose someone claimed that the code will recognize a sensor that comes into range. That's an assumption, not a proven fact. A simple experiment may confirm or refute it, and additional hypotheses and experiments can flesh out the range of circumstances in which the basic claim holds.

Reproducible builds

In order to establish reliably that the software a user receives is built solely from the source code the programmers wrote, and does not include any spyware or other malware introduced by a middleman, it is possible to leverage the crowd: many machines independently regenerate the build, and the results are compared with the binaries the project maintainers actually distributed.

However, in order for builds to be reproducible, the software has to be readily buildable, unlike, say, the Tor Browser.

Observation: We need reproducible builds.
Hypothesis: Our software does not have any spyware added to it e.g. on corrupted build computers.
Prediction: When other people build the software from the same source code, they will get the same checksums as we did on the build computer.
Experiment: Have many people rebuild the software and compare checksums to that of the original.
Analysis: The build computer is corrupted or not.
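
The comparison step is simple in practice: hash every independent rebuild and look for divergence. The sketch below uses Python's standard hashlib; the builder names and digests are made up for illustration.

```python
# Sketch: comparing checksums from independent rebuilds against the
# officially distributed binary. Builder names and digests are hypothetical.

import hashlib

def sha256_of(path):
    """Checksum of a build artifact on disk."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def diverging_builders(official_checksum, rebuild_checksums):
    """Return the builders whose rebuild differs from the official binary.
    An empty list supports the no-tampering hypothesis."""
    return [name for name, digest in rebuild_checksums.items()
            if digest != official_checksum]

# Toy usage with made-up digests:
official = "ab" * 32
rebuilds = {"builder-1": "ab" * 32,
            "builder-2": "ab" * 32,
            "builder-3": "cd" * 32}
print(diverging_builders(official, rebuilds))
```

A divergence doesn't say who is corrupted, only that the official binary and at least one rebuild cannot both be honest products of the same source, which triggers further investigation.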

Regression testing

The QA personnel have a simple question to answer, along the lines of: Did anything break when the engineers changed module X of the software at 2am? The experiment is simply to run through a list of program behaviors for the affected code. This confirms or refutes the hypothesis.

Observation: Software changed in module X.
Hypothesis: The changes broke the software.
Prediction: When we do these 20 different tests, some will definitely show incorrect results.
Experiment: Run the tests.
Analysis: It's broken or not.
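
A minimal harness for this experiment follows. The function under test and the input/expected pairs are hypothetical stand-ins for module X and the 20 tests mentioned above.

```python
# Sketch: a minimal regression harness.
# module_x_behavior stands in for the code that changed overnight.

def module_x_behavior(x):
    """Hypothetical function under test: normalize a string."""
    return x.strip().lower()

regression_suite = [      # (input, expected output)
    ("  Hello ", "hello"),
    ("WORLD", "world"),
    ("", ""),
]

def run_regressions(fn, suite):
    """Return the failing cases; an empty list means nothing observably broke."""
    return [(given, expected, fn(given))
            for given, expected in suite if fn(given) != expected]

failures = run_regressions(module_x_behavior, regression_suite)
print("all passed" if not failures else failures)
```

Note the asymmetry: a failure refutes the did-not-break hypothesis outright, while a clean run only shows nothing broke among the behaviors the suite happens to cover.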

A/B testing

These are UI experiments designed by UI/UX designers and implemented by engineers. Done properly, A/B testing is an example of actual science. The designers contemplate their users' psychology, design an experiment, and run it. Hypotheses like This background color will spur more online sales are confirmed or refuted, and conclusions are drawn, such as warm colors may increase sales.

Observation: People respond to color emotionally.
Hypothesis: A blue background on the checkout cart webpage will help sell more products.
Prediction: Assigning a blue background to half of the users' checkout pages instead of the usual white background will show greater sales for users who get the blue background.
Experiment: Randomly change 50% of users' backgrounds to blue, leaving the rest white.
Analysis: Sales to blue-background users were higher by 5%, so the hypothesis appears to be confirmed.
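
Done properly, the analysis step also asks whether the observed difference could be chance. A standard tool is a two-proportion z-test; the sketch below uses only Python's standard library, and the visit and sale counts are invented for illustration.

```python
# Sketch: analyzing an A/B test with a two-proportion z-test.
# The traffic numbers below are made up.

from math import sqrt, erfc

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """z statistic and two-sided p-value for conversion rates A vs. B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))                  # two-sided normal tail
    return z, p_value

# White background: 400 sales in 10,000 visits; blue: 460 in 10,000.
z, p = two_proportion_z(400, 10_000, 460, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")   # p < 0.05 would support the hypothesis
```

Without such a check, a 5% lift on a small sample "confirms" nothing; with it, the conclusion is still only a probability statement, which is exactly what makes this corner of software work genuinely scientific.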