Back-end Engineering Articles

I write and talk about backend stuff like Ruby, Ruby On Rails, Databases, Testing, Architecture / Infrastructure / System Design, Cloud, DevOps, Background Jobs, some JS stuff and more...

Github:
/danielmoralesp
Twitter:
@danielmpbp

2024-10-29

Ruby Hashes

So far we’ve seen different data types in Ruby, like Strings, Numbers, Booleans and Arrays. Now it’s time for the next data type: Hashes.

Arrays
A Ruby hash is a data type similar to an array, but instead of being zero indexed, we as programmers choose the index value. Remember that arrays are zero indexed, and Ruby does that behind the scenes for us. We cannot change that index: for instance, we cannot start the index of an array at 10; it will always start at zero. So if you need to change the value of that index, or you want another way to identify the elements inside the data structure, you can use a hash.

With hashes in Ruby you can assign an index yourself and then give it a value. For comparison, here is an array again, where Ruby assigns the indexes for us:

2.6.8 :120 > array = [1, 2, 3, 4]
 => [1, 2, 3, 4] 



Hash
This is the example of a hash we will use; the full line of code is shown right after the list. Compared with an array, the syntax and structure differ as follows:

  • First we have an opening curly brace instead of a square bracket. This is an important difference in syntax.
  • Then the first element of the hash has this structure: key:value. The key in a hash plays the role of the index in a Ruby array, but now we can choose it ourselves, in this case a String. After the key comes the colon “:” and after that the value (which is similar to an element in an array).
  • Then all the key:value pairs are separated by commas.
  • Finally we have the closing curly brace.


2.6.8 :121 > hash = {"first": 1, "second": 2, "third": 4, "fourth": 4}
 => {:first=>1, :second=>2, :third=>4, :fourth=>4} 


Did you notice something weird in the output of the last line of code?
Look at this output: {:first=>1, :second=>2, :third=>4, :fourth=>4} 

We have declared the keys as strings, but Ruby returns the keys as “Symbols”. That is because a quoted key followed by a colon is shorthand for a symbol key.
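A small sketch of what is going on here, using two hypothetical variable names (colon_hash and rocket_hash): a quoted key followed by a colon produces a symbol key, while a quoted key followed by a hash rocket stays a string.

```ruby
# Quoted key + colon: Ruby turns the key into a symbol.
colon_hash = {"first": 1}
puts colon_hash.keys.first.class   # prints Symbol

# Quoted key + hash rocket: the key stays a string.
rocket_hash = {"first" => 1}
puts rocket_hash.keys.first.class  # prints String
```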

Ruby Symbols
We currently use the colon “:” syntax, but in older versions of Ruby (before 1.9) the only way to separate keys from values was a different notation. Its common name is the hash rocket and it is written like so: “=>”. That is also the notation Ruby 2.6 uses when it prints a hash, as in the output above.




But what’s a symbol? A symbol looks like this:

2.6.8 :122 > :im_a_symbol
 => :im_a_symbol 
2.6.8 :123 > :hello
 => :hello 
2.6.8 :124 > :one
 => :one 


We can easily identify a symbol because it starts with a colon “:” followed by a name

Some people confuse symbols with variables, but they have nothing to do with variables. A symbol is a lot more like a string. So what are the differences between Ruby symbols & strings? Strings are used to work with data. Symbols are identifiers. That’s the main difference: Symbols are not just another kind of string, they have a different purpose.
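To make the difference a bit more concrete, here is a minimal sketch (the variable names are just for illustration). It shows that a symbol with the same name is always the same object, while two strings with the same content are separate objects.

```ruby
# The same symbol literal always refers to one and the same object.
puts :hello.equal?(:hello)               # prints true

# Two strings with the same content are equal, but they are separate objects.
first_string  = String.new("hello")
second_string = String.new("hello")
puts first_string == second_string       # prints true
puts first_string.equal?(second_string)  # prints false

# Converting between strings and symbols is explicit.
puts "hello".to_sym.inspect              # prints :hello
puts :hello.to_s                         # prints hello
```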

Symbols look cleaner, they are immutable (they cannot be modified), and if you benchmark string keys vs symbol keys you will find that string keys are about 1.70x slower. When the keys are symbols we can write the hash with the hash rocket notation or with the shorter key: value notation; both create the same hash. So we can change our first hash to be something like this



2.6.8 :125 > hash = {:first => 1, :second => 2, :third => 4, :fourth => 4}
 => {:first=>1, :second=>2, :third=>4, :fourth=>4}


So now the input and the output look the same. Given that symbols perform better than strings as keys inside hashes, we’ll prefer symbols
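As a quick check, here is a sketch comparing the two ways of writing symbol keys. The names rocket_style and short_style are made up for this example.

```ruby
# Symbol keys written with hash rockets.
rocket_style = {:first => 1, :second => 2}

# The same symbol keys written with the shorter colon notation.
short_style = {first: 1, second: 2}

puts rocket_style == short_style  # prints true
puts short_style.keys.inspect     # prints [:first, :second]
```

Both notations build the same hash, so choosing between them is a matter of style.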

Key:value pairs
The main structure of a Ruby hash is the key:value pair. They are called pairs because they always come paired. If you leave out the key or the value you’ll get an error. If you don’t have a value yet, you can put an empty string, a zero or nil.
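Here is a minimal sketch of a hash with placeholder values. The profile hash and its keys are hypothetical; besides an empty string or a zero, nil is another common placeholder.

```ruby
# Every key still has a value, even if the value is only a placeholder.
profile = {name: "Daniel", nickname: "", followers: 0, website: nil}

puts profile.length          # prints 4
puts profile[:website].nil?  # prints true
puts profile.key?(:website)  # prints true, the key exists even though its value is nil
```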


Types of Keys in a Hash
So far we’ve seen keys as strings or symbols. The question is, can I have an Integer, a Boolean or other Ruby data types as keys in a hash?

Let’s see with integers

2.6.8 :128 > hash = {0: 1, 1: 2, 2: 3, 3: 4}
Traceback (most recent call last):
        3: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `<main>'
        2: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `load'
        1: from /home/daniel/.rvm/rubies/ruby-2.6.8/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
SyntaxError ((irb):128: syntax error, unexpected ':', expecting =>)
hash = {0: 1, 1: 2, 2: 3, 3: 4}



If we try to create a hash using the key:value syntax with an Integer as a key, we’ll end up with an error. What does the error say? It is a syntax error:

SyntaxError ((irb):128: syntax error, unexpected ':', expecting =>)

Ruby is expecting a hash rocket “=>” instead of a colon “:”. So what happens if we keep the keys but create the hash with hash rockets?


2.6.8 :129 > hash = {0 => 1, 1 => 2, 2 => 3, 3 => 4}
 => {0=>1, 1=>2, 2=>3, 3=>4} 

It works! Here is a benefit of hash rockets: they allow integers as keys inside the hash. This last example actually mirrors how arrays are indexed, starting from zero. Another question that can arise: can we use integers as symbols? Let’s try it

2.6.8 :130 > hash = {:0 => 1, :1 => 2, :2 => 3, :3 => 4}
Traceback (most recent call last):
        3: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `<main>'
        2: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `load'
        1: from /home/daniel/.rvm/rubies/ruby-2.6.8/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
SyntaxError ((irb):130: syntax error, unexpected tINTEGER, expecting tSTRING_CONTENT or tSTRING_DBEG or tSTRING_DVAR or tSTRING_END)
hash = {:0 => 1, :1 => 2, :2 => 3, :3 =...
         ^



So the answer is in the error itself. A symbol written with a bare colon cannot start with a digit; Ruby expects a name (or a quoted string) after the colon. Remember that symbols are similar to strings, but they serve as identifiers in hashes.
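If you really need a symbol made of digits, one option is to quote it. This is a small sketch with hypothetical names (digit_symbol and digit_hash); in practice a plain integer key is usually simpler.

```ruby
# Quoting the name after the colon makes a digit-only symbol valid.
digit_symbol = :"0"
puts digit_symbol.class        # prints Symbol

digit_hash = {:"0" => 1, :"1" => 2}
puts digit_hash[:"0"]          # prints 1
puts digit_hash["0".to_sym]    # prints 1, a string converted to a symbol finds the same key
puts digit_hash[0].inspect     # prints nil, the integer 0 is a different key
```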

Another question: can we have Booleans as keys? Yes we can, let’s do it


2.6.8 :131 > hash = {true => 1, false => 2}
 => {true=>1, false=>2} 

But what will happen if we repeat a key with the same name?

2.6.8 :132 > hash = {true => 1, false => 2, false => 3}
 => {true=>1, false=>3} 


Ruby keeps only the last declared value. The same thing happens with symbols (here we even receive a warning)

2.6.8 :133 > hash = {:first => 1, :second => 2, :second => 3}
(irb):133: warning: key :second is duplicated and overwritten on line 133
 => {:first=>1, :second=>3} 


Can you see the beauty of all of this?


Accessing hash values
Now we probably want to access the values of our hash. To do that we pass the key and get the value back.

Note: pay attention to what we do here, because it is a bit complex and mixes everything we already know about hashes.

Let’s digest it line by line

hash_one


2.6.8 :152 > hash_one = {:first => 1, :second => 2, :third => 3, :fourth => 4}
 => {:first=>1, :second=>2, :third=>3, :fourth=>4} 
2.6.8 :153 > hash_one[:first]
 => 1 
2.6.8 :154 > hash_one[:second]
 => 2 


Here we’ve declared a hash with symbols and hash rockets. To access a value we use square brackets as with an array, but instead of passing the position/index, we pass the key of the value we want to retrieve

hash_two
2.6.8 :163 > hash_two = {'first' => 1, 'second' => 2, 'third' => 3, 'fourth' => 4}
 => {"first"=>1, "second"=>2, "third"=>3, "fourth"=>4} 
2.6.8 :164 > hash_two[:first]
 => nil 
2.6.8 :165 > hash_two['first']
 => 1 


Now we’ve declared a hash with strings as keys and with hash rockets. We have to be careful about how we look up a value: if we use a symbol, hash_two[:first], we get nil, because the symbol :first and the string 'first' are different keys. We have to use the string: hash_two['first']

hash_three

2.6.8 :166 > hash_three = {'first': 1, 'second': 2, 'third': 3, 'fourth': 4}
 => {:first=>1, :second=>2, :third=>3, :fourth=>4} 
2.6.8 :167 > hash_three['first']
 => nil 
2.6.8 :168 > hash_three[:first]
 => 1 


In hash_three we’ve created a hash with quoted keys and colons “:” instead of hash rockets. Ruby turns those keys into symbols, so hash_three['first'] returns nil, while the symbol lookup hash_three[:first] returns the correct value. This looks weird at first, and we need to be careful with it, because we can end up with nil values when we read from a hash declared this way
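When you are not sure whether a hash uses strings or symbols as keys, one way to avoid these nil surprises is to normalize the keys first. This is a sketch that assumes Ruby 2.5 or newer, where Hash#transform_keys is available; the variable names are hypothetical.

```ruby
string_keys = {"first" => 1, "second" => 2}

# Build a new hash whose keys are all symbols.
symbol_keys = string_keys.transform_keys(&:to_sym)
puts symbol_keys[:first]            # prints 1
puts symbol_keys["first"].inspect   # prints nil, the string key no longer exists here

# And back again: a new hash whose keys are all strings.
back_to_strings = symbol_keys.transform_keys(&:to_s)
puts back_to_strings["first"]       # prints 1
```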

hash_fourth

2.6.8 :169 > hash_fourth = {0 => 1, 1 => 2, 2 => 3, 3 => 4}
 => {0=>1, 1=>2, 2=>3, 3=>4} 
2.6.8 :170 > hash_fourth[0]
 => 1 
2.6.8 :171 > hash_fourth['0']
 => nil 
2.6.8 :172 > hash_fourth[:0]
Traceback (most recent call last):
        3: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `<main>'
        2: from /home/daniel/.rvm/rubies/ruby-2.6.8/bin/irb:23:in `load'
        1: from /home/daniel/.rvm/rubies/ruby-2.6.8/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
SyntaxError ((irb):172: syntax error, unexpected tINTEGER, expecting tSTRING_CONTENT or tSTRING_DBEG or tSTRING_DVAR or tSTRING_END)
hash_fourth[:0]
             ^





Finally we have the hash with integers as keys, declared with hash rockets. We can retrieve the values as with an array, passing the integer key: hash_fourth[0]. If we pass the key as a string, hash_fourth['0'], we get nil, and if we try to pass it as a symbol, hash_fourth[:0], we get a syntax error. That is another odd behavior we have to watch out for
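Getting nil back silently can hide mistakes like the ones above. As a sketch of an alternative, the built-in Hash#fetch method raises a KeyError when the key is missing, or returns a default that you provide. The names integer_hash and key_error_raised are hypothetical.

```ruby
integer_hash = {0 => 1, 1 => 2}

puts integer_hash.fetch(0)                     # prints 1
puts integer_hash.fetch("0", :missing).inspect # prints :missing, the default we passed

# Without a default, a missing key raises KeyError instead of returning nil.
key_error_raised = false
begin
  integer_hash.fetch("0")
rescue KeyError => error
  key_error_raised = true
  puts "KeyError: #{error.message}"
end
```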

Adding elements to a hash
As with arrays, we can start with an empty hash and then add elements to it. Both things are very simple

2.6.8 :180 > new_hash = {}
 => {} 
2.6.8 :181 > new_hash[:first] = 1
 => 1 
2.6.8 :182 > new_hash['second'] = 2
 => 2 
2.6.8 :183 > new_hash
 => {:first=>1, "second"=>2}


Let’s explain
  • The first line of code creates an empty hash with the literal syntax for Ruby hashes: “{}”
  • Then we started adding new elements. We added the first element with the symbol syntax, giving the key and then assigning the value: new_hash[:first] = 1
  • In the next line we did the same thing but using a string as the key: new_hash['second'] = 2
  • Finally we printed out the final hash
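Besides the square bracket assignment, Ruby offers a couple of other built-in ways to add pairs. Here is a minimal sketch with a hypothetical scores hash, using Hash#store and Hash#merge!.

```ruby
scores = {}

scores[:first] = 1                       # square bracket assignment, as above
scores.store(:second, 2)                 # same effect as scores[:second] = 2
scores.merge!({third: 3, fourth: 4})     # adds several pairs to the same hash at once

puts scores.length        # prints 4
puts scores.keys.inspect  # prints [:first, :second, :third, :fourth]
```

Note that merge! changes the hash in place, while merge (without the exclamation mark) returns a new hash and leaves the original untouched.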

Modifying hash values
Once we have a hash we can modify the internal values like so

2.6.8 :186 > new_hash = {:first=>1, "second"=>2} 
 => {:first=>1, "second"=>2} 
2.6.8 :187 > new_hash['second'] = 3
 => 3 
2.6.8 :188 > new_hash
 => {:first=>1, "second"=>3}


We’ve created new_hash, then we referenced the key 'second' like this: new_hash['second'], and assigned it a new value, in this case the Integer 3. When we printed the hash we saw the new value assigned to this key
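The new value can also depend on the current one. A short sketch with a hypothetical stock hash:

```ruby
stock = {apples: 3, pears: 0}

stock[:apples] += 2                  # read the current value, add 2, store it back
stock[:pears] = stock[:apples] * 2   # compute a value from another key

puts stock[:apples]  # prints 5
puts stock[:pears]   # prints 10
```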

Deleting hash elements
We can also delete hash elements, in this case using a built-in Ruby method called “.delete”

2.6.8 :189 > new_hash = {:first=>1, "second"=>2} 
 => {:first=>1, "second"=>2} 
2.6.8 :190 > new_hash.delete(:first)
 => 1 
2.6.8 :191 > new_hash
 => {"second"=>2} 


The “.delete” method receives the key we want to delete, in this case :first, and returns the value that was removed. When we print the hash afterwards it has just one element left
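A couple of details are worth trying out yourself. This sketch uses a hypothetical letters hash and shows what “.delete” returns for a key that does not exist, plus the built-in “.delete_if” method, which removes every pair that matches a condition.

```ruby
letters = {:first => 1, "second" => 2, :third => 3}

removed = letters.delete(:first)
puts removed.inspect   # prints 1, the value that was removed

missing = letters.delete(:tenth)
puts missing.inspect   # prints nil, there was nothing to delete

# Remove every pair whose value is greater than 2.
letters.delete_if { |_key, value| value > 2 }
puts letters.length    # prints 1
```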

Iterating over a hash
As with Ruby arrays, we can iterate over hashes. This time we have to take into account both the key and the value of each pair, and we use the “each” method we saw in a previous post

2.6.8 :201 > hash = {'first': 1, 'second': 2, 'third': 3, 'fourth': 4}
 => {:first=>1, :second=>2, :third=>3, :fourth=>4} 
2.6.8 :202 > hash.each do |key, value|
2.6.8 :203 >     puts key, value
2.6.8 :204?>   end
first
1
second
2
third
3
fourth
4
 => {:first=>1, :second=>2, :third=>3, :fourth=>4} 




Let’s explain these lines of code
  • First we reference the variable that holds the hash
  • Next to it we call the method “.each” to iterate over it
  • After that we open the Ruby block with the keyword “do”
  • At the end of the first line we now have two variables between the pipe symbols: “|key, value|”. The names don't matter; what matters is the position of each one, because the first always receives the key and the second the value. It’s quite similar to array iteration; the only difference is that now we have two things to work with (keys and values)
  • Inside the body of the block we call the “puts” method to print both of them: key and value
  • Finally we close the block with the keyword “end”
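The same idea works with a few related built-in methods. This is a minimal sketch with a hypothetical numbers hash; it uses curly braces for one-line blocks, which do the same job as “do ... end”.

```ruby
numbers = {first: 1, second: 2, third: 3}

# Key and value together, formatted on one line each.
numbers.each { |key, value| puts "#{key}: #{value}" }

# Only the values: add them up.
total = 0
numbers.each_value { |value| total += value }
puts total             # prints 6

# Only the keys: collect them as strings.
names = []
numbers.each_key { |key| names << key.to_s }
puts names.inspect     # prints ["first", "second", "third"]

# Build a new hash with every value doubled.
doubled = numbers.map { |key, value| [key, value * 2] }.to_h
puts doubled[:third]   # prints 6
```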

Very interesting isn’t it?

Other Ruby hash methods
This is the last thing we’ll do with hashes. Ruby has many built-in methods for common tasks with hashes. If you want the exhaustive list, go to the documentation, where the left panel lists all the methods: https://ruby-doc.org/core-2.5.1/Hash.html


But let’s see some important ones

2.6.8 :215 > hash = {'first': 1, 'second': 2, 'third': 3, 'fourth': 4}
 => {:first=>1, :second=>2, :third=>3, :fourth=>4} 
2.6.8 :216 > hash.length
 => 4 
2.6.8 :217 > hash.has_key?(:first)
 => true 
2.6.8 :218 > hash.has_key?(:tenth)
 => false 
2.6.8 :219 > hash.keys
 => [:first, :second, :third, :fourth] 
2.6.8 :220 > hash.values
 => [1, 2, 3, 4] 

Let’s check line by line
  • First we’ve created the hash
  • Then we asked for the length, the number of key:value pairs, and we got 4
  • Then we asked if the hash has the key “:first” and the result was true
  • Then we asked if the hash has the key “:tenth” and the result was false
  • Then we asked for the keys and we got an array of the keys
  • Finally we asked for the values and we got an array of the values
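A few more built-in methods you will probably use often, shown as a short sketch on a hypothetical ordinals hash:

```ruby
ordinals = {first: 1, second: 2, third: 3, fourth: 4}

puts ordinals.key?(:first)        # prints true, key? is a shorter alias of has_key?
puts ordinals.value?(4)           # prints true, some key holds the value 4
puts ordinals.empty?              # prints false
puts ordinals.key(3).inspect      # prints :third, the key that holds the value 3

# select returns a new hash with only the pairs that match the block.
evens = ordinals.select { |_key, value| value.even? }
puts evens.keys.inspect           # prints [:second, :fourth]

# to_a turns the hash into an array of [key, value] arrays.
puts ordinals.to_a.first.inspect  # prints [:first, 1]
```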


Pretty awesome!

Hope you learnt a lot about hashes

Thanks for reading

DanielM