Back-end Engineering Articles

I write and talk about backend stuff like Ruby, Ruby On Rails, Databases, Testing, Architecture / Infrastructure / System Design, Cloud, DevOps, Backgroud Jobs, some JS stuff and more...

Github:
/danielmoralesp
Twitter:
@danielmpbp

2024-12-03

Introduction to Logarithms in Data Structures

In the last blog post about Big O Notation we talked about three basic notations like O(1) - Constant Time, O(n) Linear Time and O(n**2) Quadratic Time, to describe the time and space complexity of a given algorithm. 

However, at the end of the article we talked about other types of notations that help us do the same but for other kinds of algorithms. Actually I shared my personal cheatsheet about Big O Notation


As you can see we have 2 other notations related with Logarithms (please don’t confuse Logarithm with Algorithm). That notation is described as log(n). Now we’re going to focus just on O(log(n)) Logarithmic Time

O(log(n)) - Logarithmic Time
This notation can be pretty scary at the beginning, but I’m going to do my best to explain at a high level what this means. 

The first thing we have to check is the graph and it looks like this


If you follow the blog post mentioned above, you can see that a problem solved logarithmic time complexity is a good one, because when we have more values in N (or volume of data in the array) the execution time tends to stabilize even if we have a bunch of new data. That's the magic behind the logarithmic solutions

Understanding Logarithms
I think the easiest way to explain this concept is as follows

What are the results of the next operations in Ruby? (open you IRB console and test by yourself)

➜  ~ irb
2.6.8 :001 > 2**1
 => 2 
2.6.8 :002 > 2**2
 => 4 
2.6.8 :003 > 2**3
 => 8 
2.6.8 :004 > 2**4
 => 16 
2.6.8 :005 > 2**5
 => 32 
2.6.8 :006 > 2**6
 => 64 
2.6.8 :007 > 2**7
 => 128 
2.6.8 :008 > 2**8
 => 256 



Let’s figure out what’s happening here
  • - All the operations were made by calculating the 2 to the power of 1, 2, 3, 4, etc.
  • - All the exponentials change just by 1. So, we started with 1, then 2, then 3 and so on. So the change is just by 1
  • - If you see the results of each operation we have other math facts. 

  •   - First result == 2
  •   - Second result == 4 (which is the double of the last result)
  •   - Third result == 8 (which is the double of the last result)
  •   - Fourth result == 16 (which is the double of the last result)
  •   - Fifth result == 32 (which is the double of the last result)
  •   - Sixth result == 64 (which is the double of the last result)

  • I think you got it!
  • The conclusion is that we’re changing the exponential just by 1, but the result is the double of the last operation
This is why the graph we saw before tends to stabilize even if we have big numbers, because the exponent change just by 1, let’s see the graph explaining this effect


Logarithmic formula
Once we understand the result of the logarithm is easier to see the math formula



Note: in computer science and to check the time and space complexity with this notation, we’ll be always using logarithms base 2. (not base 10 or others)

If you want to experiment with other values, you can always check for calculators in internet (or do it by yourself), like this one: https://www.omnicalculator.com/math/log-2


How can we translate this to code?

Now the idea is to see what kind of problems in coding are logarithmic. Let’s imagine an array of 100 elementos, from 1 to 100. In that case the formula is expressed by 


log2(100) = ?

If we do this operation using the internet calculator shared before, the result should be 6.644. This means that in just 6.644 iterations we’ll resolve the problem!

But how is this expressed in the code?

A common problem that has this kind of behavior are searching operations. For instance, we have 100 array elements in ascending order and we are asked to find the number 22 in the array. One way to solve this is called “divide and conquer”, so we cut the array to half and choose the left side of that split. So we end with the first 50 elements. Then we repeat the process and cut it in half, so we end up with the first 25 elements and inside that array we’ll find the number 22. 

Of course we have different ways to solve this problem, but here we’re talking just about searching operations as a solution that meets the requirements to be logarithmic. Later in new blog posts we’ll be solving these kinds of problems. For now I think we have enough theory to understand the logarithmic notation

I hope you learnt a lot

Thanks for reading
DanielM