Tuesday 5 May 2015

Kernel mein (as in Hindi) Kernel

Dear Reader,

I've backlogged quite a bit on my blogs and hope to complete them in the next couple of weeks. Every time I sit down to complete a blog, I'm flooded with topics for a new blog!! This blog is technical and addresses issues in data security.

I will be updating this blog later with some code snippets and samples. But until then, I've only penned my thoughts.

Let me know what you think in the comments section.

Regards,
Jyothin

-----------------------------------------------------------------------------------------------------------------

Introduction

Mobile device security is getting a much-needed push from technologists who want to provide a secure mobile experience to users. Whether it is financial transactions, personal data security, photo security, device security, or protection from viruses, it is imperative to think about how secure our lives are now that mobile devices have become extensions of an individual's personality.

There are numerous ways in which a device can be made secure: DaaS (Desktop as a Service), ACLs (Access Control Lists), encrypted data transmission, etc. Some go the extra mile to ensure device security: double encryption, remote wipes and auto-locks, mobile ID authentication mechanisms, isolated special subnets for mobiles, signal range control, etc.

The Probability of 'Safe'-keeping

In a non-digital scenario, when an average person wants to keep safe the items that are dear to them, the most likely way to do it is to install a safe in their house and secure it with a password or number lock. This sounds fine and works pretty well in most cases. Let's call this safe the outer safe.

Let's for a moment assume that the probability of someone breaking into the safe is x. Since breaking into the safe is all it takes to steal your items, the probability of someone accessing your items is also x.

Let's also assume that you feel x is not small enough for you. So you decide to put another mini-safe within the outer safe and secure it with a password, a number lock, or any other better locking mechanism. Let's call this mini-safe the inner safe. Now the probability of someone breaking into your outer safe is still x, but the probability of someone accessing your items is no longer x.

Assume again that the probability of someone managing to break open the inner safe is y. If the two break-ins are independent events, then by the multiplication rule of probability the probability of someone accessing your items is x times y.

With only an outer safe:
P[breaking open the safe] = x
P[accessing your items] = x

With an inner safe inside the outer safe:
P[breaking open the outer safe] = x
P[breaking open the inner safe] = y
P[accessing your items] = x * y

For any x greater than 0 and any y less than 1, P[accessing your items] with an inner safe is less than P[accessing your items] with only an outer safe.

For example, if x = 0.5 and y = 0.5,
With only an outer safe:
P[breaking open the safe] = 0.5
P[accessing your items]   = 0.5

With an inner safe inside the outer safe:
P[breaking open the outer safe] = 0.5
P[breaking open the inner safe] = 0.5
P[accessing your items]         = 0.5 * 0.5 = 0.25
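
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name p_access is made up for illustration, and it assumes that each break-in is an independent event, so that the probabilities simply multiply.

```python
from math import prod


def p_access(*break_probs: float) -> float:
    """Probability of reaching the items when every nested safe must be broken.

    Assumes each break-in is an independent event, so the probabilities multiply.
    """
    for p in break_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"not a probability: {p}")
    return prod(break_probs)


x, y = 0.5, 0.5
print(p_access(x))     # outer safe only -> 0.5
print(p_access(x, y))  # outer and inner safe -> 0.25
```

Passing more than two values models further nested safes under the same independence assumption.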

What's a 'Safe' in the Digital World?

A safe is a fixed-size enclosed space that stores your items and requires a key to unlock. In the digital world, your items are data (bytes of information) stored in a fixed-size region of memory, and a safe would be a means of protecting that memory space.

In operating systems, especially those based on the Linux kernel, a user typically runs processes/tasks and reads/writes memory in user space. The operating system itself runs in a privileged (protected) mode in kernel space. A user asks the operating system for access to resources in kernel space. Resources could be I/O devices, protected memory space, processor resources, etc. Requests are made through system calls, and device drivers commonly expose custom functionality through IOCTLs.

A user cannot access kernel space resources directly. So, in a way, one can think of everything in kernel space as being put in a safe (the outer safe), with access restricted to users who have the root password OR who go through drivers. If there are no drivers, then the only way to access resources in the safe is as root.

For the purposes of this discussion, let's assume the resource we are interested in accessing is data in a fixed-size memory region in kernel space. Using the same idea described in the previous section, if we can install an inner safe within the outer safe (in this case, a kernel within a kernel), the probability of someone accessing data within the inner kernel is reduced, which means increased security for the data. A protected mode within a protected mode!!

Of course, we would need a means of accessing the inner safe from the outer safe, namely a driver. In a mobile device, everything is stored as bytes of data, and drivers can be loaded/unloaded as required, thus adding another layer of security. Further, there is nothing stopping the driver from 'loading' the inner kernel at run time. So the driver loads the inner kernel only when required, accesses the protected data in the inner kernel, and shuts it down when done. The inner kernel is modified for every modify/write of data into its protected memory. So, in effect, the inner safe does not exist unless it is required to access the data!

It should be possible to create multiple such independent inner safes to protect different items and load them at run time.
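
The load, access, shut down cycle can be pictured with a toy model. The sketch below is ordinary user-space Python, not kernel code, and it offers no real protection; the class names InnerSafe and Driver, the method open_safe, and the in-memory image store are all hypothetical, chosen only to show the lifecycle of an inner safe that exists only while it is in use.

```python
from contextlib import contextmanager


class InnerSafe:
    """Toy stand-in for an inner kernel: it holds data only while loaded."""

    def __init__(self, image: dict):
        self._data = dict(image)  # contents restored from the stored image

    def read(self, key):
        return self._data[key]

    def write(self, key, value):
        self._data[key] = value

    def snapshot(self) -> dict:
        return dict(self._data)


class Driver:
    """Toy 'driver': the only code path that can load an inner safe."""

    def __init__(self):
        self._images = {}  # safe name -> stored image (hypothetical storage)

    @contextmanager
    def open_safe(self, name: str):
        safe = InnerSafe(self._images.get(name, {}))  # load at run time
        try:
            yield safe
        finally:
            self._images[name] = safe.snapshot()  # persist any writes
            del safe  # the safe no longer exists until the next load


driver = Driver()
with driver.open_safe("photos") as safe:
    safe.write("pin", "1234")
with driver.open_safe("photos") as safe:
    print(safe.read("pin"))
```

Each name passed to open_safe gets its own independent image, which mirrors the idea of several inner safes protecting different items.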

Isn't virtualization the same thing?

No. Simplistically put, virtualization techniques typically allow one to virtualize hardware resources in order to run different operating systems on the same hardware. You can then install and run different software applications on the same hardware and hardware resources, and everything is managed by the virtualization environment.

The aim of a kernel within a kernel is to provide an added level of security to data that is already held in protected (kernel) space.

Can I put another Safe inside the Inner Safe?

Sure you can, but the marginal advantage of having more than two safes would eventually diminish to 0 unless the locking mechanism at each inner safe is completely different from that of the other safes. Different locking mechanisms make it difficult for anyone to break open every safe, because a technique that opens one does not open the next.
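
To see why extra safes help less and less, here is a rough sketch that compares two hypothetical cases: locks that must each be broken independently, and identical locks where breaking one means breaking them all. The 0.5 break-in probability and both function names are assumptions made purely for illustration.

```python
def p_access_independent(p: float, layers: int) -> float:
    """All locks differ: each must be broken separately, so probabilities multiply."""
    return p ** layers


def p_access_identical(p: float, layers: int) -> float:
    """All locks are the same: whoever breaks one can break the rest,
    so extra layers add nothing (a deliberately pessimistic assumption)."""
    return p if layers >= 1 else 1.0


p = 0.5  # hypothetical chance of breaking any single safe
for n in range(1, 6):
    # gain = how much the n-th safe lowers the chance of access
    gain = p_access_independent(p, n - 1) - p_access_independent(p, n)
    print(n, p_access_independent(p, n), round(gain, 5), p_access_identical(p, n))
```

With independent locks the gain from each additional safe halves every time (0.5, 0.25, 0.125, ...), so it shrinks towards 0; with identical locks the probability of access stays at 0.5 no matter how many safes are nested.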

READ vs. READ, MODIFY, WRITE

Breaking into an inner safe will allow one to read the contents of the inner safe. Modifying the contents is a totally different activity. However, once read, the data is compromised. In the case of an inner kernel, breaking in will potentially allow one to read the contents of its protected memory, modify them, and write them back. READ-MODIFY-WRITE is easier in the digital world.

Moreover, creating the inner kernel would require one to hard-code certain kernel data that is unique to that version of the inner kernel, and also to ensure that this uniqueness persists even when the kernel is shut down.

An Inner Safe cannot be duplicated but an Inner Kernel can...

Hmm... I guess it can.

Conclusion

I'm no expert on data security or on Linux kernel internals, and until I make an attempt to write a driver to access an inner kernel, the intricacies and challenges involved in implementing it are beyond my understanding.

If you are really paranoid about securing your data, take a space flight to the moon, find an undisclosed uninhabited area, dig a hole, bury it, and don't tell anyone.

Thursday 23 April 2015

What's your cloud footprint?

Dear Reader,

How are you? Last year's monsoon in India was not a very wet one, although the clouds tell a different story in Bangalore (this sentence is now obsolete, as the monsoon and winter seasons have passed and it is summer at the time of finishing this post). So do cloud technologies and cloud based applications. There have been numerous announcements, press releases, software stacks, APIs, etc. around cloud based technologies, and more are happening every day and very quickly. Personally, I think cloud based applications have huge potential for the right people, albeit with some speed bumps, none of them insurmountable.

When I come across a cloud based application, my first instinct is to weigh the pros and cons of using a cloud based service. And to better understand my own "cloud preferences", I set out to find my cloud footprint.

If you feel there is a better way to estimate one's cloud footprint, let me know in the comments section.

Regards,
Jyothin

---------------------------------------------------------------------------------------------------------------------------

What is 'Cloud Footprint'?

Let's first define two terms that I will be using extensively in this blog,
       cloud, in the cloud
       [any data that you cannot physically move yourself but can be accessed (read/write) from a reliable network connection]
       going all cloudy
       [implies a shift to an all "in the cloud" approach]

The analogous opposite terms are clear, in the clear, and going all clear.

Going completely cloudy implies that all your data is in the cloud and a network connection is essential. Going completely clear implies that all your data is physically accessible to you at all times and you do not need a network connection to access it.

By the above definitions,
  • Public email services like Gmail, Yahoo mail, Hotmail, etc. constitute data that is in the cloud
  • Your office email (not based on a cloud email service) is not in the public cloud but more like in a private cloud
  • Your Facebook, G+, Twitter, Instagram, Pinterest, Reddit posts and data are all in the cloud
  • Content, such as photos, images, audio, video that is accessible from a public network is all in the cloud
  • Your personal storage on laptops, HDDs, etc. is all clear
  • Your search queries on Google, Bing, etc. are in the cloud
  • Home videos on DVD, VHS tapes (if you still have them) are all clear
  • Notes you take on paper, documents, copies of documents, etc. are all clear

       cloud footprint
       [is an indication of your inherent preference for cloud based applications]

It became very apparent to me that network access, specifically Internet access, is a key component of accessing data in the cloud. You could download all your Instagram photos onto your desktop PC or laptop and say that you are not using a network to access the data, but once you've downloaded the photos, they are clear. Moreover, if your data is updated in real time, your downloaded copy is no longer the latest snapshot of the data and is hence obsolete.

Cloud does not only include your own data but also includes other people's data that you can read. For example, watching YouTube videos also constitutes cloud data, even if it does not occupy your own storage space. The best way to quantify this aspect of your cloud data is by the amount of data that is transferred between you and any cloud based service that you access. By that measure, digital broadcast television/radio is also cloud data, because technically it can be replaced by a combination of cloud based content delivery services (reading news on a news website vs. watching the news on the TV) and you pay for both.

Hence,
        cloud footprint = function of (
                                     your in the cloud storage,
                                     data transferred between you and a cloud service,
                                     your own personal storage,
                                     physical effort to access your personal storage
                               )

More precisely,
       cloud footprint = Your cloud storage
                         + Your total data transfer
                         - Your personal storage
                         - Your physical effort to access your personal storage

For simplicity, let's assume that everyone's physical effort in accessing their personal storage is 0. Also, note that a footprint value of 0 could mean either an equal preference for cloud and non-cloud OR no preference at all!
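
The formula translates directly into a small function. This is a minimal sketch under stated assumptions: every quantity is measured in gigabytes over the same period (the post does not fix the units), the function and parameter names are made up for illustration, and the numbers in the usage line are invented, not my actual figures.

```python
def cloud_footprint(cloud_storage_gb: float,
                    data_transfer_gb: float,
                    personal_storage_gb: float,
                    access_effort: float = 0.0) -> float:
    """Cloud footprint as defined above.

    A negative value suggests a preference for 'clear' storage,
    a positive value a preference for the cloud.
    access_effort defaults to 0, matching the simplifying assumption.
    """
    return cloud_storage_gb + data_transfer_gb - personal_storage_gb - access_effort


# Made-up numbers for illustration only, not real measurements.
print(cloud_footprint(cloud_storage_gb=20, data_transfer_gb=150, personal_storage_gb=1000))
```

With these invented inputs the result is negative, because the personal storage term outweighs cloud storage and data transfer combined.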

With that in mind, my own cloud footprint parameters are shown in the sections below.

My Cloud Storage



My Network Footprint




My Personal Storage


My Cloud Footprint

And therefore my cloud footprint,

Therefore, it looks like my inherent preference for cloud based technologies is negative. This makes me an ideal candidate for services to target and increase my cloud footprint. Targeting me alone is not going to lead to any significant increase in the overall usage of cloud services. So, please leave me alone.

However, I wasn't too surprised by my negative footprint, as I'm intentionally slow to upload content. For one, I'm not a social media junkie and tend to limit myself to only important posts on social media sites. I rarely post or comment on websites like Reddit, Instagram, etc. and have limited usage of WhatsApp and other chat apps.

My initial reaction to calculating my cloud footprint is that a large % of the population probably has a negative cloud footprint.

Non Quantifiable Factors

The debate over whether a cloud based service has more advantages than disadvantages is incomplete without attempting to understand the non-quantifiable benefits of using a cloud service. I've listed below some non-quantifiable factors that I think are important when choosing a new cloud service.


Obviously, the importance of the above non-quantifiable factors varies from individual to individual. IMO, omni-access is a factor that I personally prefer having in addition to omni-connectedness. I'd also like to have the option to Full-Delete my data whenever I want to, which is quite impossible when your data is in the cloud but as easy as `$> rm -rf *` for data on my personal devices.

Data security is pretty much taken as a default when using a cloud service. Or rather, data security is a hidden cost when using a cloud service. Most cloud services guarantee a certain level of data security and assured data access at any time. On the other hand, securing your private data is an added effort with additional costs to be borne by the individual. Below is a summary of my data security costs.


What are the Costs?

Another way to decide whether you prefer using a cloud service is to look at the costs associated with your current cloud footprint. Tabulated below are my approximate costs as estimated for my cloud footprint. If you want to take a look at the details of the estimates below or feel something is amiss, drop me a separate note in the comments section.


Clearly, at my current cloud footprint, it's costing me more to use cloud based services than to use my personal storage! And it is only going to cost me more to use new cloud based services.

Completely Cloudy vs. Completely Clear or In-between?

If the conveniences of omni-access, omni-share, and omni-connectedness are your priorities, then going all cloudy is the way to go.

Is there an ideal 'cloud' to 'clear' mix? One can't say for sure; it depends a lot on one's personal preferences and, more importantly, on one's behavioral patterns in consuming NEW data.

With new technologies all based in the cloud, one would expect the overall cloud footprint to increase over time. But with high capacity consumer memory devices getting cheaper, there isn't anything stopping consumer electronics from increasing their storage capacities, thereby reducing the dependence on cloud based services.

Again, one can't really say.

Enterprises, on the other hand, have very compelling reasons to use cloud based services. A cloud service is pretty much an operating expense and less capital intensive than setting up the infrastructure to build an enterprise owned private cloud and providing all employees a means to access it. There will also be other enterprise level factors to consider when estimating an enterprise's cloud footprint.

Whatever the course, life is not going to come to a standstill whether you go completely cloudy or completely clear, but you will still need an umbrella when it rains.

Sunday 8 February 2015

The Seven Ages of Man

All the world's a stage,
And all the men and women merely players.
They have their exits and their entrances,
And one man in his time plays many parts,
His acts being seven ages. At first the infant,
Mewling and puking in the nurse's arms.
Then, the whining school-boy with his satchel
And shining morning face, creeping like snail
Unwillingly to school. And then the lover,
Sighing like furnace, with a woeful ballad
Made to his mistress' eyebrow. Then, a soldier,
Full of strange oaths, and bearded like the pard,
Jealous in honour, sudden, and quick in quarrel,
Seeking the bubble reputation
Even in the cannon's mouth. And then, the justice,
In fair round belly, with a good capon lined,
With eyes severe, and beard of formal cut,
Full of wise saws, and modern instances,
And so he plays his part. The sixth age shifts
Into the lean and slippered pantaloon,
With spectacles on nose and pouch on side,
His youthful hose, well saved, a world too wide
For his shrunk shank, and his big manly voice,
Turning again toward childish treble, pipes
And whistles in his sound. Last scene of all,
That ends this strange eventful history,
Is second childishness and mere oblivion,
Sans teeth, sans eyes, sans taste, sans everything.

- William Shakespeare


If you are a man, at some point in your life you will have gone through or experienced at least one of the seven stages of man. Of course, the time a man spends in each stage varies from man to man.

Knowing which stage a man is in helps in either managing or being managed in an organization. 

I wonder if there is something similar for women?