What is it about?

Research on SQL, a query language for relational databases, has shown that its users make many mistakes, which makes them inefficient. The first step toward solving this problem is to find the underlying reasons why students make mistakes; we call these misconceptions. To figure out what misconceptions novice SQL users hold, we ran a study in which we asked 21 SQL novices to write queries in SQL while describing their thought process. This is known as a think-aloud study. From the errors that our participants made and the problem-solving methods they applied, we identified four categories of misconceptions.

The first category is that of previous course knowledge. Every time we learn something new, we build it on top of something we already know. This existing knowledge can be helpful (for example, if you know how to bake an apple pie, it is not so difficult to then make a cherry pie), but it can also be a hindrance. Trying to fry an egg based on the knowledge of how to boil an egg will not help you succeed. The same holds for learning SQL. Many things transfer over well, but some do not, which leads to misconceptions.

The second category of misconceptions is that of generalizations. In programming, it is common to teach students patterns: similar problems that can be solved by the same approach. This is straightforward, but it can lead to problems when the pattern is not delimited correctly. For example, one pattern is that when you are in the classroom, you need to listen to the teacher and be quiet. But when the teacher asks you a question, you should answer it (and thus not be quiet). This delimitation was not always done correctly, either for our participants or by them. This led them to apply patterns in cases where the patterns do not hold.

The third category is that of language-based misconceptions. SQL syntax is very strict and only accepts words that are predefined by the interpreter. This means that as a user, you need to remember the correct keywords. Our participants did not always do so: they wrote queries with synonyms of the correct words, leading to queries that could not be executed. A user's native language also plays a role in query formulation and may hinder it.

The fourth and final category is that of incorrect or incomplete mental models. This means that our participants did not completely understand SQL and the interpreter. Suppose you know about traffic lights, and that they have a red, yellow and green light (at least here in The Netherlands). But you do not know that the lights always turn on in a set order, and instead think that the color is chosen randomly. This is similar to some of the misconceptions of our participants, who could not always judge whether the interpreter could understand the query they had written.

For a deeper technical explanation of all 14 misconceptions we found, check out the paper!
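As a rough illustration of the language-based category described above, the sketch below runs two nearly identical queries against an in-memory SQLite database using Python's standard `sqlite3` module. The `student` table, its contents, and the choice of `SORT BY` as a synonym for `ORDER BY` are assumptions made for this example; they are not taken from the study's data.

```python
import sqlite3


def run_query(conn, sql):
    """Return (rows, None) on success, or (None, error message) if SQLite rejects the query."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        return None, str(exc)


# Hypothetical table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, grade INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [("Ann", 8), ("Bob", 6)])

# The predefined keyword: the interpreter accepts the query.
correct = run_query(conn, "SELECT name FROM student ORDER BY grade")

# A plausible synonym (assumed, not from the study): the interpreter rejects the query.
synonym = run_query(conn, "SELECT name FROM student SORT BY grade")

print(correct)
print(synonym)
```

The second query reads naturally in English, yet the interpreter refuses it because `SORT` is not the keyword it expects in that position. The intent is clear to a human reader, but only the exact predefined word is accepted.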

Why is it important?

This is the first study of SQL misconceptions that applies a think-aloud methodology, which gives us direct insight into the thought process of our participants. Now that we have a list of misconceptions, we can design interventions to improve SQL education and to make SQL users more effective at formulating queries. As SQL is ubiquitous in education and practice, this is highly valuable.


This was a really fun study to run, mostly because of the interaction I had with all our student participants. I hope you have gained some insight into where gaps in SQL instruction might be.

Daphne Miedema
Technische Universiteit Eindhoven

Read the Original

This page is a summary of: Identifying SQL Misconceptions of Novices: Findings from a Think-Aloud Study, August 2021, ACM (Association for Computing Machinery),
DOI: 10.1145/3446871.3469759.