Is the lack of privacy a real shortcoming of the language, or is our judgment clouded by the old conventions of C++ and Java? Why do we need private variables anyway — at what point does defensive programming become paranoia?
In my last Python article I wrote a short implementation of a read-only attribute in a Python class, whilst noting that if you were determined enough you could still change anything you like in a Python object. I’ve had a couple of comments since then asking why I didn’t just make the attribute private, or asking if Python supports private variables. In short, it doesn’t: there’s no ‘private’ keyword and no other way of making a field truly private (the leading-underscore convention and double-underscore name mangling signal intent, but neither prevents access).
Still, it got me wondering: is it possible to somehow implement private variables? Python gives us a lot of flexibility to change how variables are accessed, so is it possible to somehow protect class variables?
To be completely private, the field has to be accessible by any means that originates from inside the class it exists in, but not by any other means. This means blocking outside access not only when the field is referenced normally (that is, i.x for field x of object i) but also when it is taken from the object dictionary.
I spent the morning trying different strategies to implement a class with private variables, using either old-style or new-style classes. You’d have to override __getattr__ (or __getattribute__ instead in new-style classes) of course, and you can use the inspect module to examine the interpreter frame so you can tell which access attempts came from a local method and which didn’t.
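The strategy just described can be sketched roughly as follows for a Python 3 class. This is only an illustration, not the code from that morning's experiments: the names Guarded, secret and reveal are made up, and the test for "came from a local method" (the calling frame has a local called self that is this object) is a simplifying assumption that is easy to fool.

```python
import inspect


class Guarded:
    """Hypothetical class that tries to hide the attribute 'secret'."""

    _hidden = {"secret"}

    def __init__(self):
        # Assignment goes through __setattr__, so the guard below is not hit.
        self.secret = 42

    def __getattribute__(self, name):
        if name in Guarded._hidden:
            # Frame of whoever triggered this attribute lookup.
            caller = inspect.currentframe().f_back
            # Assumption: an "inside" caller is a function whose local
            # variable `self` is this very object.
            if caller is None or caller.f_locals.get("self") is not self:
                raise AttributeError(f"{name!r} is private")
        return object.__getattribute__(self, name)

    def reveal(self):
        # Called from a method, so the guard lets this through.
        return self.secret


g = Guarded()
inside_value = g.reveal()      # allowed: 42

try:
    g.secret                   # refused: the caller is not a method of g
    outside_blocked = False
except AttributeError:
    outside_blocked = True
```

The guard only covers the dot form, and any function that happens to bind the object to a local named self would pass the check.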
To cut a long story short it’s not possible, by any means that I can see — if you can prove me wrong, send me an email and we’ll publish your solution.
The reason for this is a consequence of how Python treats objects: in Python everything is a dictionary. In operation, classes are little more than syntactic sugar for dictionaries. When you call, for example, the i.add method, the interpreter looks first in the i.__dict__ dictionary, then in the dictionaries belonging to its class and base classes, in method resolution order. Whichever entry matches first is used.
Now, because traditional access to attributes and methods goes through the __getattr__ or __getattribute__ methods, you can intercept it there, and it is possible in a number of ways to implement access control for the dot-form, but you can’t restrict access to the dictionary.
Even worse, since the object’s dictionary is used to store and retrieve attributes and methods, you can’t mess with it and expect the class to behave in the same way: if you delete the entry relating to a method in an instance, that method cannot be used, from inside or outside the class. If you edit the dictionary of the class the instance belongs to, it affects all instances of that class.
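As a small illustration of that lookup order, here is a sketch with two made-up classes, Base and Child. It shows the common case only; data descriptors such as properties defined on the class are a known exception, since they take precedence over the instance dictionary.

```python
class Base:
    def add(self, a, b):
        return a + b


class Child(Base):
    pass


i = Child()

# 'add' lives only in Base's dictionary, so it is found via the
# method resolution order: Child, then Base, then object.
in_instance = "add" in i.__dict__        # False
in_child = "add" in Child.__dict__       # False
in_base = "add" in Base.__dict__         # True
from_base = i.add(1, 2)                  # 3

# An entry in the instance dictionary is found first and shadows the method.
i.add = lambda a, b: "shadowed"
from_instance = i.add(1, 2)              # "shadowed"

# Remove the instance entry and the lookup falls back to Base again.
del i.add
restored = i.add(1, 2)                   # 3
```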
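To make the gap concrete, here is a minimal sketch, assuming a made-up class Locked that refuses every dot-form read of an attribute called secret. The value is still reachable through the instance dictionary, and through the base implementation of __getattribute__ on object.

```python
class Locked:
    """Hypothetical class that blocks dot-form access to 'secret'."""

    def __init__(self):
        # Store the value directly in the instance dictionary.
        self.__dict__["secret"] = 42

    def __getattribute__(self, name):
        if name == "secret":
            raise AttributeError("'secret' is private")
        return object.__getattribute__(self, name)


obj = Locked()

try:
    obj.secret
    dot_form_blocked = False
except AttributeError:
    dot_form_blocked = True

# The dictionary route ignores the guard entirely.
via_dict = obj.__dict__["secret"]                       # 42

# So does calling the base implementation directly.
via_object = object.__getattribute__(obj, "secret")     # 42

# The dictionary can be written to as well as read.
obj.__dict__["secret"] = 0
after_write = object.__getattribute__(obj, "secret")    # 0
```

Refusing the name __dict__ inside the override doesn't close the gap either, because object.__getattribute__(obj, "__dict__") bypasses the override in the same way.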
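The class-wide effect is easy to see in a short sketch. The class Counter and its bump method are made up for illustration. In Python 3 a class's __dict__ is exposed as a read-only view, so the edit here is made by assigning to and deleting the class attribute, which changes the same underlying dictionary.

```python
class Counter:
    def bump(self):
        return "bumped"


a, b = Counter(), Counter()
before = (a.bump(), b.bump())            # ("bumped", "bumped")

# Replace the entry in the class: every existing instance sees the change.
Counter.bump = lambda self: "patched"
after_patch = (a.bump(), b.bump())       # ("patched", "patched")

# Delete the entry: the method is gone for every instance at once.
del Counter.bump
still_there = hasattr(a, "bump") or hasattr(b, "bump")   # False
```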
Is this such a bad thing? Why is it that we want to lock people out of our classes? I’ve never really thought about it before now: if a field wasn’t used outside the class in my reference code, then it was dutifully protected from the outside world. The problem with this kind of programming is that it leads you into thinking of the people using your code as malicious.
There’s a lot to be said for defensive programming, but after a certain point it’s reducing the power of our code. I could see the argument towards field safety if you were distributing a library and you wanted to reduce the possible bugs others could write using your code, but for the majority of cases it appears that it’s motivated out of a fear that users will tamper with the programmer’s perfect code, in nothing less than a malicious attempt to destroy its purity.
Do we resent others using our work outside our intentions? Some really useful things have been accomplished by using code in ways its authors had never intended (and probably never wanted, for that matter).
Next time you reach for the ‘private’ keyword, have a good think about whether it’s necessary, or whether you’re reducing the potential of your code.