# Why My Teenage Code Was Terrible: Sorting Algorithms and Big O Notation

When I was about 15, I had a week of 'work experience'. Americans would call that an 'internship', but that's not really right: work experience is a thing in British schools that basically just gets the older kids out of the way for a few days and palms them off on local businesses so the teachers can have a break. I can't actually remember the name of the company now, but anyway, they had a problem: they had recently scanned in a load of photos, all the history of the place, but those photos weren't categorised. They just had hundreds, maybe a couple of thousand, pictures, all in one folder with arbitrary filenames. And the work experience kid got assigned the job of going through them all and writing descriptions. "Well, that's boring", I thought, because of course I did.

It was really dull. It's the job you give the work experience kid, because they're not disenchanted enough with the world yet. So in between typing the descriptions, I figured I'd show off and write a system to search through them. It'd be like Yahoo, or that new Google thing that was starting to become popular. It'd look great.

I'd look really smart. I didn't know anything about how to build that, but with utterly unjustified confidence, I figured it couldn't be that big a problem. And it wasn't: take the search word, loop through all the descriptions, count how many words match, then return all the matching results, sorted by the number of matches. And yeah, it worked fine, until I got over about fifty photographs, at which point it started to get a little bit slow. And after a hundred, it started to get really slow. And I started to spend more time trying to fix it and less time actually doing my job. Although, I mean, I wasn't getting paid, so…

Anyway, to explain why my code sucked, let's talk about Big O notation. Whatever you're coding, there are almost certainly many different ways to approach the problem. But even though all those approaches can end up with the same result, they may take vastly different amounts of time or memory to get there. And the best example of that is sorting. An animation like this is a classic of computer science; I remember seeing demonstrations of things like this on a DOS computer, complete with bleep-bloop sound effects, when I was tiny. So we've got a bunch of different-sized blocks: we've got a list. What algorithms can we use to put our list in order?

Let's start with something called bubble sort. Take the first two blocks and compare them. If the first one is longer than the second, swap them; if not, leave them where they are. Now compare the next two blocks.
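That compare-and-swap step is the whole of the work being described here. As a minimal Python sketch (illustrative only, not code from the video):

```python
# One comparison step of bubble sort: compare two adjacent blocks,
# and swap them if the first is bigger than the second.
def compare_and_swap(blocks, i):
    if blocks[i] > blocks[i + 1]:
        blocks[i], blocks[i + 1] = blocks[i + 1], blocks[i]

blocks = [5, 2, 9]
compare_and_swap(blocks, 0)  # 5 > 2, so they swap
print(blocks)  # [2, 5, 9]
```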

Same again: if they're in the wrong order, swap them; if they're not, leave them. And so on, and so on, until we get all the way to the end of the list. And at that point, we have moved the biggest block in the list to the very end. And that's all. It's not very efficient, but it has now been done. So now we can go again from the start: compare, swap if we need to, compare, swap if we need to, and so on, and so on. If we get all the way start-to-finish without moving a thing, then we know everything is sorted and we can stop. And if not, we go back to the start. And eventually, after many, many, many passes: congratulations, we have a sorted list!

Bubble sort, with each block "bubbling up", is terrible. It has to check every block in the list on every pass, and in most cases it'll have to pass through the list as many times as there are blocks in it. Ten blocks means ten operations, performed ten times. A hundred blocks means a hundred operations, performed a hundred times. The number of operations it has to perform goes up with the square of the number of blocks, which means that it has a 'Big O' of n².

The foundations of Big O notation, plus that O symbol, were laid down towards the end of the 19th century by German mathematicians working in analytic number theory. But it was popularised in computer science by the great Donald Knuth in the 1970s. "Ka-nooth"?
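Back to bubble sort for a moment: the full procedure, with the stop-when-nothing-moved check, might be sketched in Python like this (a minimal illustration, not code from the video; the comparison counter is added to show the n² growth):

```python
import random

def bubble_sort(blocks):
    """Sort a list in place, counting comparisons to show the n² growth."""
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False  # if a full pass makes no swaps, everything is sorted
        for i in range(len(blocks) - 1):
            comparisons += 1
            if blocks[i] > blocks[i + 1]:
                blocks[i], blocks[i + 1] = blocks[i + 1], blocks[i]
                swapped = True
    return comparisons

for n in (10, 100):
    data = random.sample(range(n * 10), n)
    print(n, bubble_sort(data))  # comparisons grow roughly with n²
```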

I should have checked that pronunciation. Anyway: Big O distills an algorithm down to a single expression, which indicates how it performs as you add more blocks, more inputs of any kind. It lets you compare, roughly, approximately, in a big-picture sense, how slow an algorithm will get as you add more and more complexity. n² means that the processing time goes up with the square of the number of inputs: double the number of inputs, and the run-time quadruples.

The lowest Big O notation an algorithm can reasonably have is 'constant', O(1). That means that no matter how much stuff you throw at it, it's going to return its answer in the same amount of time. And O(1) is usually something really simple, like "return the first item in the list". Some common algorithms have a Big O of n, or 'linear'. This means that processing time goes up at a steady rate with the number of items. So if your code needs to go through a list and just do one thing to each item in it, that would be linear: double the items, double the processing time.

So let's try some other sorting algorithms. Here is insertion sort: start with the second item and compare it to the item before. If they're the wrong way round, swap them. Now take the third item and compare it to the one before: if they're the wrong way round, swap them, and then check again with the one before that. Basically, you keep moving each item in turn towards the start, checking as you go, until either you find something smaller, and you know it's in the right place, or you reach the start of the list, and you know it's the smallest. If you've got a hand of cards and you're putting them in numerical order, this is the algorithm you're probably using without even thinking about it.

And yes, that is usually faster than bubble sort. But crucially, its Big O notation is also n². And this is a really important distinction to make: if two algorithms have the same Big O, it doesn't mean they perform the same. It just means that if you draw the graphs of how they get worse with more inputs, those graphs have the same shape. For both bubble sort and insertion sort, if you double the number of items, it'll quadruple the run time; but with insertion sort, that run time is usually shorter to begin with.

Now, I'm not going to go into the details of every sorting algorithm, there are a lot of them, but there are two I want to flag up. The first is Quick Sort. Choose an item on the list and call it the 'pivot'. Then go through the list, putting anything smaller than the pivot somewhere to its left, and anything larger somewhere to its right. Once that's done, look at just the items to the left of the pivot and do the same: choose a pivot and split those. Repeat that divide-and-conquer all the way down, then do the same to the other side of each of those pivot points, and work your way back up. The worst-case Big O notation for this is still n², but the average performance on a random list is n log n. I'm not going into the mathematics, but suffice it to say that is a lot, lot better.

And then, on the other end of the scale, there is bogosort, which is designed as a joke. Bogosort is really simple: randomise the list. Is it sorted? Great! If it's not, randomise again, and keep going until it's sorted. The average performance there is O((n+1)!). Factorial.

You can draw a graph of all the common Big O notations to compare them. But the important thing is: the Big O of whatever you're coding is going to be mostly affected by the worst one of those that you have anywhere in it. If your code does ten operations that are linear and one that's got a Big O of n², well, your program is n² now. Remember, it's just an approximation; it's a guide for humans to work out which algorithm is best to use. Despite being about computer science, it's a little bit fuzzy.

So what should I have done with my search system, back when I was a kid? Well, I could have used code that was already out there, realised that other people had done the work for me. That's true. But ultimately, what I should have done is what the client asked for: typed the descriptions into a word processor and saved them as a document. Because that was a better solution: they could have just searched it by pressing Control-F. Text doesn't break down or need updating when an operating system gets upgraded. If they had new photos to add, anyone who knew how to type could update it. They could still open that document today.

The big problem wasn't that I used a bad algorithm: the big problem was that I was ignoring what my users actually needed, because I wanted to show off how clever I thought I was. It's important to think about how fast things will work, sure, but the best solution isn't always the fastest, or the smartest. It's the one that works for everyone, long-term.

This series of The Basics is sponsored by Dashlane, the password manager. And there's one aspect of password managers that I haven't talked about in these sponsored sections yet: protecting you against phishing attacks. This is a slightly embarrassing story, but let's say you've signed up to Dashlane (which, incidentally, you can do by going to dashlane.com/tomscott for a free 30-day trial of Dashlane Premium). And let's say that at some point later on, you're exhausted and tired, and you click a link to view a Google Doc that someone's shared with you, and you get asked to sign into Google. You just assume that your session has timed out for security, as it does sometimes, so you type in your email address. I'm saying "you" here, but this literally happened to me, because I was exhausted and not thinking straight.

So, let's say I did that. And then my password didn't autofill, because the domain name wasn't right. It wasn't a Google page. Autofill checks the domain that you're logging into; if the domain doesn't match, it won't type your password in for you. And that was the clue I needed to wake up and realise that, actually, that wasn't a real login link, and that was a really close call. The people trying to pull off a phishing attack only have to fool you once. Now, password managers aren't foolproof, there's always someone more foolish: I could have assumed something was broken in the software and tried to copy-and-paste my password instead. But it was enough of a red flag, enough of a warning, that I saw what was happening.

So: dashlane.com/tomscott for a 30-day free trial of Dashlane Premium, which includes unlimited password storage and sync, plus a load of other features. And if you like it, you can use the code "tomscott" for 10% off.

Dashlane continue to be a great sponsor for these videos! You can get their password manager for free on your first device at https://www.dashlane.com/tomscott – there's a 30-day free trial, and 10% off with my code, "tomscott"!

How do you look old and young at the same time?

Flashbacks to my CS degree. I prefer using .Sort()

so basically in this video i learned more about how efficient certain sorting algorithms are than in one whole year of IT class three years ago

and then someone creates radix sort, and in that moment the nightmares started to appear

Tom was trying to make A.I. to sort some pictures

If you are interested in sorting algorithms and are also looking for a YouTube sinkhole to spend your awake nights in, I recommend "sorting Hungarian folk dance". There's a couple of different algorithms and they are all dynamite.

7:13 – 8:05 is perfect

Wait, so what is the best sorting algorithm though?

PIVOT

I cable managed my computer with Bogosort.

My Lazy Sort algorithm is O(∞), and doesn't even require any code.

Reminds me of the computerfile title for a similar video

I think saying that "that runtime is usually shorter to begin with" in this context ( 5:44 ) implies that because it performs better in best- and average-case, the worst case is better for something like insertion sort over bubble sort, instead of emphasizing that Big O categorizes by worst-case. It might be better to state that when each algorithm is doing the most work possible to sort, they will run in the same amount of operations (relative to input).

Overall great video though, I appreciate the effort to make Big O more palatable, I would've appreciated it when I was learning it.
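For reference alongside this discussion, the insertion sort the video describes, where each item is walked back towards the start, can be sketched in Python (illustrative only):

```python
def insertion_sort(items):
    """Sort in place by moving each item back towards the start."""
    for i in range(1, len(items)):
        j = i
        # Keep swapping backwards until we find something smaller,
        # or we reach the start of the list.
        while j > 0 and items[j - 1] > items[j]:
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return items

print(insertion_sort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```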

7:42 Using a simple CSV file would've been better because your data would've been technology independent. You can search through it with Word, VSCode, Notepad++, grep, Select-String (that's from PowerShell), heck even classic notepad would work!

But I'm just nitpicking.

This guy only has one t-shirt

Great intro to big O. Simple to understand and with a nice visual aid. I hope you do more programming basics videos.

I learned Insertion Sort differently – one that is O(n log(n)): Rather than going down the list and comparing and swapping every item, instead you take advantage of the partially sorted list and find the insertion point using a hi-lo search algorithm, and insert it there directly. The hi-lo search algorithm is O(log(n)) and you have n items, so it takes O(n log(n)). The hi-lo search algorithm is recursive, starting with your sorted list, pick the item in the middle and compare to your item to insert – is your item higher? then your new search area is the upper half, if lower, then the lower half, repeat halving your search space each time until you've found your insertion point. Yes, it does require a "shift" or "move" operator to insert, unlike most other sorting algorithms that only require a swap operator, and if your list is a physical array, such that the shift/move requires O(n) time, then it can still end up O(n^2) time – most really large lists are optimized in blocks, such that such shifts take only O(log(n)) time. Even when that's not the case, it is the comparison that really takes the most time: test dependent branching is slow.
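The binary-search variant this comment describes might look like the following sketch, using Python's `bisect` module for the hi-lo search (an assumed implementation; note the `insert` call itself still shifts elements, which is the O(n) cost the comment acknowledges):

```python
import bisect

def binary_insertion_sort(items):
    """Build a sorted copy, finding each insertion point by binary search."""
    result = []
    for item in items:
        pos = bisect.bisect_right(result, item)  # O(log n) hi-lo search
        result.insert(pos, item)                 # O(n) shift in a Python list
    return result

print(binary_insertion_sort([4, 2, 7, 1]))  # [1, 2, 4, 7]
```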

Adult me: Eh, at least you could do a bit of coding at that age!

Teenage me:

*chortles every single time Tom said "Big O"*

In fairness, you say Dashlane not autofilling is a red flag, but Dashlane completely breaks regularly and doesn't autofill. Sometimes you'd be lucky if you could then go and copy your password, because odds on the main app has crashed randomly and won't reopen. 1Password is far more reliable.

If the pinned comment is not 5 days ago, it's a bootleg Tom Scott

the one that works for everyone

long term

Very, very good!

Young programmer: How can I sort a hundred items in my custom list in about a minute?

Senior programmer: How can I tweak this legacy database to keep ten million items sorted without impacting performance?

"work experience" is a thing here in the states too. It's usually a full semester or year though.

🌎

This video basically mentions how scene groups release game cracks

oh god where was this vid when i was doing my finals from sorting algorithms,,,

I had work experience last month at a music store…

Best sort of all times: Sleep Sort 🙂

To be fair, you are somewhat clever. Maybe not as clever as you think, but still somewhat clever.

Knowing about Big O and runtime and so on is one of those things in computer science that you hardly ever really "use" (and it gets fuzzy really quickly if you think about it too hard), but knowing about it makes you think about problems in a totally different way. And I like the remark at the end about premature optimization 🙂

I wish I had this video when I was learning this stuff in school.

Wait… are you claiming that customers actually know what they want? 😅

Ah yes, work experience. That thing where I chose ten local companies: graphic design companies, art suppliers, libraries, publishers… and was sent to Sainsburys to stack sodding shelves.

The best solution is the one that works for everyone longterm… So the exact opposite of what every major tech company is doing since 2015.

2:33 – You don't need to check the last element on the second pass. Or the last two on the third pass, etc.
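The optimisation this comment (and a few others below) describes, shrinking each pass because the largest remaining item is already in place, could be sketched like this (an assumed variant; still O(n²) overall):

```python
def bubble_sort_shrinking(blocks):
    """Bubble sort, skipping the tail that is already in its final position."""
    end = len(blocks) - 1
    while end > 0:
        swapped = False
        for i in range(end):
            if blocks[i] > blocks[i + 1]:
                blocks[i], blocks[i + 1] = blocks[i + 1], blocks[i]
                swapped = True
        end -= 1         # the largest remaining item has bubbled into place
        if not swapped:  # a clean pass means the whole list is sorted
            break
    return blocks
```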

There’s a video of people doing a dance showing some sorting algorithms

As a former teacher I can confirm the two weeks when an entire year group of snotty kids, I mean delightful teenagers, was out on work experience was like a mini-holiday. And yes, I know, as a teacher I had plenty of holiday anyway. The payback was every teacher had to go and visit at least two of the pupils at their placement to see how they were getting on, but that was also a rare trip out of school and meant we could get some shopping in or have a coffee.

I find that most people who have not studied computer science or data structures and just learned to code write very inefficient code because modern computers are fast enough to hide the inefficiency until scaled up

Big Oof Notation

as people in my country are trying to give kids work experience tom is already tired of it

Bogosort is the troll of all sorts
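For the curious, bogosort really is as short as the video suggests; a sketch using Python's `random.shuffle` (do not run it on anything big):

```python
import random

def is_sorted(items):
    return all(a <= b for a, b in zip(items, items[1:]))

def bogosort(items):
    """Shuffle until sorted. Average case O((n+1)!): strictly a joke."""
    while not is_sorted(items):
        random.shuffle(items)
    return items

print(bogosort([3, 1, 2]))  # eventually prints [1, 2, 3]
```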

Good introduction to time complexity. I would, however, have liked it if you went a bit more in depth on Big O notation. Mentioning that O(n) + O(n²) is O(n²) was great, but it's also worth noting that O(1000n) is still just O(n).

Or descriptions could've been put to JPEG metadata instead of a document, that way they would always be attached and never go out of sync (unless someone copies the files).

And then if needed, that document could've been generated from that.

But that's an entirely different, irrelevant story 😉

P.S. Assuming that images were in jpeg format…

Once in my life I was in a very similar situation, and I went down a very similar path, resulting in… oh boy, was I screwed.

When I see a bogo sort

YIKES

This was a nice refresh of one day lesson to my freshman computer science classes. Honestly, I’m surprised I forgot about this, and this was explained just as well as my favorite professor minus the fun group of people showing a visual example of how each sort function works. Great video!

No love for bogobogosort?

I'm saddened

So what the customer asked for is an index, and you made a sorting algorithm instead.

When I was in high school, I was in charge of sorting in the largest online image gallery of a certain TV show that was popular, but also slightly obscure enough that we could be completely sure that it was in fact the largest gallery of images from that show and it's corresponding merchandise on the web at that time.

I created folders and subfolders, probably the most folders. Nothing had tags, so I had to organize one by one into the correct folders and subfolders that I came up with. I could move files that were sent to my section and create and delete folders, but not change any base code. I was excellent at it and thought about what would be most efficient as I sorted away.

This wasn't for school or anything, and I'm sure it'll help everyone sleep at night knowing that the gallery was lost forever after it was put on temporary hold while a Google search consultant was organising a site revamp and the owner died. I spent two or three years of my free time on that.

What does sorting have to do with images? File names are already sorted by the operating system.

My computer science teacher showed this to us today and tonight it showed up in my recommended.

I have mixed feelings about this video. Did you just tell me not to program my own things?

Definitely considering using Bogosort in my application now haha.

Not as much fun as I thought “big O negation” would be…

Sleep sort is the best sorting algorithm FACT.

I know the background is cool, but you could also green-screen in footage from the museum and film/record in a less echoey room. Just an idea. It's not to the point where it's bad, just a bit distracting.

The fact you do these videos in one single shot is why I enjoy watching them so much. That, and the content is always interesting!

Very nice we have done the exact same thing today at school.

To be fair, trying out programming when it wasn't needed, back when your other work efforts weren't yet critical to paying rent, might've been the thing that got you interested in learning more about it, leading up to today where you explain it to us. You tried out something new and you learned something, and I think that's better than doing a mundane, unpaid job the way they wanted you to.

Programming: You get really good at something, you managed to become a leader in your field in terms of performance, optimisation, accessiblity, etc, then some idiot invents a front end library that everyone starts dumping business logic into and you're screaming so loudly internally that you burst. If you think I'm an idiot for dissing React now, wait till it happens to you too. Also, stop using React. Thanks.

The ones blessed with the Youtube recommendations knows that LSD base 10 sort is the best

Don't worry I'm german and my Kot smells terrible

I am bogosort

Good takeaway. "The best solution is the one that works for everyone, long-term." This is something I could start employing in my own practice and I'm not even a programmer.

Great video! Thought you should know you're doing bubble sort wrong, though. For every iteration N on a list of size X you only check X-N items because you know that the top N are in their correct positions. It's still O(n²) but it requires half as many operations as you claim in the video.

great ^^

The biggest issue I struggle with as a CS student is trying to come up with the "perfect" solution, while there is none, and ignoring the simplest solutions just because they seemed "too easy".

at 15 when you were doing all that I was still figuring out how to eat ice cream by not dropping it on my shirt. Absolute Genius

My work experience (in Finland) was at an insurance company. They gave me the menial job of typing addresses on envelopes using _a typewriter_. So many envelopes. I actually managed to break one of the letter arms.

There is an oversight/improvement in the animation of Quicksort. You don't need to look at any pivot element ever again after splitting the list because you know they are in the correct position
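The point this comment makes shows up naturally in a list-building quicksort sketch: the pivot values go straight into their final position and are excluded from both recursive calls (illustrative Python, not the in-place version shown in animations):

```python
import random

def quicksort(items):
    """Average O(n log n), worst case O(n²); pivots are never revisited."""
    if len(items) <= 1:
        return items
    pivot = random.choice(items)  # random pivot makes the worst case unlikely
    left = [x for x in items if x < pivot]
    middle = [x for x in items if x == pivot]  # already in final position
    right = [x for x in items if x > pivot]
    return quicksort(left) + middle + quicksort(right)

print(quicksort([3, 6, 2, 7, 1]))  # [1, 2, 3, 6, 7]
```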

Heard about big O notation years ago and I never understood it at all until now. The wikipedia page on it is cryptic af.

How come you explained that clearly in under ten minutes but it took my professor at university an entire term and was also ridiculously expensive?

"i flew a kite in a public place"

If you look closely, Tom Scott looks like a K-pop star

No…Tom, how…you don't need to check the entire list on each cycle of bubble sort. You just said, that after one cycle, the biggest object is at the end of the list, so you obviously don't need to check that one again. With each cycle you just reduce the cycle length by one, because every last object of a cycle is then already in its right place, then you terminate if you make a full cycle without any swaps. That way you cut down the run time by about half. I do the same thing with my implementation of selection sort.

I mean, it's still n², but still, mistakes like that can break a system

“So the teachers can have a break” 😱😱

Wait, if you got assigned the job of writing descriptions, where did the descriptions you searched come from?

You just explained Big O Notation better than any of my Comp Sci profs in university

I love how the graph for bogosort is just straight-up vertical. Pun intended.

For my work experience I went and worked in a local Infant School…

It was awful

this had a great shyamalan-esque twist

IE: UX is king

Now the question is: did you learn the lesson of doing what the 'customer' asked during your 'internship'? Because then it was one of the most useful internships you could have done, and the company gave a future employee really useful information. If you learned it later, it was still useful.

As for opening old Word files with images today, I wouldn't count on it. A lot of them have been corrupted throughout various Office updates. I recently had to help out at a university that lost a significant amount of data because of this :/ File names should definitely be descriptive as well.

3:43 Donald Kenobi

Yes, bubble sort is O(N²), but you can avoid half the work by stopping earlier in each pass. And if your data is nearly sorted already, it is O(N) and often outperforms quicksort. For small inputs bubble sort might also be faster, and data/cache locality can make a big difference. Definitely start with the system quick/heap sort, but if performance really matters (and it rarely does for most things), test it on real data and document the assumptions you made in choosing the algorithm.

But definitely don't write new code if there's a perfectly good way to do it with existing tools. 😁

The moral of the story here is very relevant

Reminder that this video is basically one take. Insane.

Just saw you on the Christmas lectures which is super cool

Agreed! I wish Microsoft knew that… If they did… I'd still be using Windows! Linux is now the greatest: free, superior, flexible, and on average much much faster!

This is really good! Gonna show this to my programming students.

Insertion sort as commonly implemented only does the swap-to-insert dance AFTER it has found the final resting place of the item it's checking. An optimization on it can use binary search to find the correct position of each element a little quicker. It's called INSERTION sort because it inserts new data into an already sorted set.

If you checked the size of each block then sort them in order of height would that be O(n)?


Bogosort is the drunk relative stumbling into the house at Christmas.

This is an incredibly well-timed video for someone going into a second-semester programming course starting yesterday.

thanks for reminding me i have to find a work experience placement…

I disagree with your conclusion. I wish the world was full of people that would make some algorithm besides doing the stupid task. It shows creativity; it lets you learn. You wouldn't have the youtube channel you have today if you weren't like that back then! Curiosity and exploration are gifts.