Image Steganography and Steganalysis
During my preparatory classes, I worked on a project that I presented at the oral examination. It was one of my first encounters with the world of research in mathematics. I worked on an extremely interesting subject that I did not even suspect existed. The goal of the project was to hide information in an image and to be able to decode it afterwards. The technique behind this process is called steganography.
Steganography
Definition
Steganography is the practice of concealing a message within another message or a physical object. In other words, the main idea is to hide one message inside another.
For instance, let’s take a look at this simple encoded message:
Now, let’s take a look at this image:
Image
A color image is a 3-dimensional array or, if you prefer, three 2D arrays. Each array represents a channel: red, green, or blue. The values in each array are integers between $0$ and $255$. You can think of a pixel as a mixture: it represents a color obtained by mixing given proportions of red, green, and blue.
In a color image, keep in mind that each pixel is a triplet, and this triplet represents a color. For example, the triplet $(255,0,0)$ represents red, $(0,255,0)$ represents green, $(0,0,255)$ represents blue, and a general triplet $(r,g,b)$ represents an arbitrary color.
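To make the "three 2D arrays" view concrete, here is a minimal sketch in plain Python. The tiny 2x2 image, the mixed color $(200,120,30)$ and the helper name `split_channels` are my own illustrative assumptions; a real image would usually be loaded with an imaging library instead.

```python
# A tiny 2x2 color image as nested lists: rows -> pixels -> (r, g, b).
# The pixel values are made up for illustration.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
image = [
    [RED, GREEN],
    [BLUE, (200, 120, 30)],  # an arbitrary mixed color
]

def split_channels(img):
    """Return three 2D lists (red, green, blue), one per channel."""
    return tuple([[pixel[c] for pixel in row] for row in img] for c in range(3))

red, green, blue = split_channels(image)
print(red)    # [[255, 0], [0, 200]]
print(green)  # [[0, 255], [0, 120]]
print(blue)   # [[0, 0], [255, 30]]
```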
Least Significant Bits steganography : Principle
With that in mind, we can use a steganography technique called Least Significant Bits (LSB) to hide a message in our image. How is that possible? We said that a pixel is a triplet. If we consider a single component, its value is between $0$ and $255$. So in total there are $256$ possibilities, that is, $2^8$ possibilities. In computer terms, each value can be encoded on $8$ bits. Here is an example with $20$, which is written $\overline{00010100}^2$ on $8$ bits.
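If you want to check the 8-bit representation yourself, the following short sketch does it with Python's built-in `format`. The helper name `to_bits` is hypothetical; it is only there for illustration.

```python
def to_bits(value: int) -> str:
    """Return the 8-bit binary representation of a value in [0, 255]."""
    if not 0 <= value <= 255:
        raise ValueError("a pixel component must be between 0 and 255")
    return format(value, "08b")

print(to_bits(20))   # 00010100
print(to_bits(200))  # 11001000
```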
The LSB technique exploits a weakness of the human eye. We can barely tell pixel colors apart when each component of the pixel varies by an amount between $0$ and $7$. Here I represent the color red: I decreased the red component seven times and plotted the results side by side. The same thing was done for the color green and the color blue.
The strip should look uniform to your eyes even though it is not. That is the trick we will use here. Such a variation is equivalent to changing the three least significant bits.
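The link between "a variation of at most 7" and "the three least significant bits" can be checked with a small sketch. I assume here that the red strip goes from $255$ down to $248$ in steps of one; the names `high_part` and `low_part` are hypothetical helpers.

```python
N_BITS = 3
LOW_MASK = (1 << N_BITS) - 1  # 0b00000111 = 7

def high_part(value: int) -> int:
    """Keep the five most significant bits, clear the three lowest."""
    return value & ~LOW_MASK & 0xFF

def low_part(value: int) -> int:
    """Keep only the three least significant bits."""
    return value & LOW_MASK

# Assumed strip: the red component decreased seven times, starting at 255.
shades = [255 - k for k in range(8)]
for v in shades:
    print(v, format(v, "08b"), high_part(v), low_part(v))
```

All eight shades share the same five high bits (`11111`); only the three low bits differ, which is why the strip looks uniform.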
Here is how we can encode a message in our image. Let us consider one component of a pixel, $\overline{11001000}^2$. Its three least significant bits are $000$. The message is $110$. In general, the message can be anything that can be represented in bits. Here it will be the three most significant bits of a component of the image we want to hide. We simply replace the least significant bits of the original component with our message, and we get $\overline{11001110}^2$. Doing this for every component gives the encoded image. Decoding is then straightforward: read the three least significant bits, use them as the most significant bits of the decoded component, and fill the five remaining positions with some random bits.
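The encoding and decoding steps above can be sketched on a single component as follows. This is only a sketch under a few assumptions: the function names `embed` and `extract` are mine, the secret value `0b11010110` is an arbitrary component whose top three bits are $110$, and the decoder fills the missing low bits with zeros rather than random bits to keep the result reproducible.

```python
N_BITS = 3  # number of least significant bits used to carry the message

def embed(cover: int, secret: int, n: int = N_BITS) -> int:
    """Replace the n low bits of `cover` with the n high bits of `secret`."""
    low_mask = (1 << n) - 1
    return (cover & ~low_mask & 0xFF) | (secret >> (8 - n))

def extract(stego: int, n: int = N_BITS) -> int:
    """Move the n low bits of `stego` back to the high positions.

    The remaining low bits are set to zero here (an assumption; the text
    fills them with random bits instead).
    """
    low_mask = (1 << n) - 1
    return (stego & low_mask) << (8 - n)

cover = 0b11001000   # component of the cover image (from the text)
secret = 0b11010110  # assumed component of the hidden image, top bits 110

stego = embed(cover, secret)
print(format(stego, "08b"))           # 11001110
print(format(extract(stego), "08b"))  # 11000000
```

To process a whole image, one would apply `embed` to every component of every pixel, pairing each cover component with the component at the same position in the image to hide.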
Why do we only use the last 3 bits, you may ask? I tried with 4 bits and, although we get a better-quality decoded image at the end, the encoded image shows two superimposed images. What image do you see behind that cat? To keep the hidden image undetectable to the naked eye, we must stick to the 3 least significant bits.
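The trade-off between 3 and 4 bits can be quantified with a brute-force sketch over all possible component values. The helpers below are hypothetical and assume that the decoder pads with zeros; the numbers are worst cases per component, not a measure of what the eye actually perceives.

```python
def embed(cover: int, secret: int, n: int) -> int:
    """Store the n high bits of `secret` in the n low bits of `cover`."""
    low_mask = (1 << n) - 1
    return (cover & ~low_mask & 0xFF) | (secret >> (8 - n))

def extract(stego: int, n: int) -> int:
    """Recover the n hidden bits as high bits, padding with zeros."""
    return (stego & ((1 << n) - 1)) << (8 - n)

def worst_case(n: int) -> tuple:
    """Return (largest change to the cover, largest error on the secret)."""
    values = range(256)
    cover_err = max(abs(embed(c, s, n) - c) for c in values for s in values)
    secret_err = max(abs(extract(embed(0, s, n), n) - s) for s in values)
    return cover_err, secret_err

print(worst_case(3))  # (7, 31)
print(worst_case(4))  # (15, 15)
```

With 4 bits the decoded image is more faithful (error up to 15 instead of 31), but each cover component can move by up to 15 instead of 7, which is consistent with the hidden image starting to show through.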
Least Significant Bits steganography : Results
Now let me show you some examples. We have a bike image on the left. We hide some text in it as the message, and on the right you can see the same image, now carrying the encoded text. Looks the same, right?
Here is another example with our cover image, the image to hide, the encoded image and the decoded image. The decoded image does not have the same quality as the original one, but we still have something visible, and it is enough for what we want to do. We now know how to hide an image inside another one.
Steganalysis
If you made it this far, thank you. You have just learned a powerful technique. At least, it is powerful as long as the message, or its very presence, is not detected. Why am I saying that? Because there is something called steganalysis.
According to Wikipedia, steganalysis is the study of detecting messages hidden using steganography; this is analogous to cryptanalysis applied to cryptography. We will learn a simple steganalysis technique. I do not know if it has a name, but here is the idea behind it. Let us consider our bike image. Here is a component of a pixel in the image: $\overline{11001000}^2$. We simply reverse the order of its bits to get a new component, $\overline{00010011}^2$. Reversing the bits moves the least significant bits, which are normally invisible, to the most significant positions, where they dominate the displayed value.
When we do that with the original image, we get this image. It is mainly made of random noise. But when we apply it to the encoded image, we get this new image. We can clearly see that something is not right in the left part of the image. Even without knowing what is hidden, we can tell that something is hidden.
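The bit reversal itself can be sketched in a couple of lines. The function name `reverse_bits` is hypothetical, and the encoded component `0b11001110` is the one obtained earlier when hiding the message $110$.

```python
def reverse_bits(value: int) -> int:
    """Reverse the order of the 8 bits of a component in [0, 255]."""
    return int(format(value, "08b")[::-1], 2)

original = 0b11001000  # component from the text
encoded = 0b11001110   # same component carrying the message 110

print(format(reverse_bits(original), "08b"))  # 00010011
print(format(reverse_bits(encoded), "08b"))   # 01110011
```

After reversal, the three hidden bits sit in the three most significant positions (in reversed order, `011`), so they largely decide how bright the component looks. Applying `reverse_bits` to every component of an image is what brings the hidden content to the surface.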
It gets even worse when we hide an image inside another. Here we have a cover image and the image we want to hide. By applying the steganalysis technique to the encoded image, we get this image. It is clearly almost the hidden image.
The steganography technique we have just learned is not very robust once your message is seen by someone who knows the right steganalysis techniques. There are many other steganography techniques out there, and a lot to learn.