The Curious Case of Matching Memory Locations in Python
I attended a Python meetup yesterday that was quite informative. One talk I found fascinating was "Cleaning the trash in Python" by Rivas. He starts by explaining what trash means in the programming sense, and goes on to compare memory allocation for objects and arrays in C and Python.
He mentions that when we create the value 100 and then use it again, Python points the new name at the existing 100 object instead of creating another one. He demonstrates this by looking up the memory locations of objects with the built-in id function and showing that they are the same. What exactly does id do?
id: Return identity of an object
In CPython, the identity is the object's memory address, and it is unique among objects that are alive at the same time. This can be used to check whether two names point to the same object. So we will go right ahead and see a few quirks of Python using this.
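As a quick illustration of what id tells us, here is a minimal sketch. The names first, alias and copy are made up for this example. It shows that two names bound to one object share an id, while an equal copy has its own.

```python
# Two names bound to one list share an identity; an equal copy does not.
first = [1, 2, 3]
alias = first          # a second name for the same object
copy = list(first)     # a new object with equal contents

same_object = id(first) == id(alias)   # one object, two names
equal_copy = first == copy             # contents match
distinct_copy = first is not copy      # but they are separate objects

print(same_object, equal_copy, distinct_copy)   # True True True
```

Comparing ids of two live objects is what the is operator does, so first is alias gives the same answer as id(first) == id(alias).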
In [1]: a = 100
In [2]: b = 100
In [3]: id(a)
Out[3]: 139944507028096
In [4]: id(b)
Out[4]: 139944507028096
All is well and good in Python land, being memory efficient and all when we initialize something with 100. Let's proceed:
In [5]: c = 257
In [6]: d = 257
In [7]: id(c)
Out[7]: 139944437174032
In [8]: id(d)
Out[8]: 139944437170992
So what happened exactly? Let's try another example:
In [9]: e = 256
In [10]: f = 256
In [11]: id(e)
Out[11]: 139944507033088
In [12]: id(f)
Out[12]: 139944507033088
Uh...oh! Any guesses what happened here?
Explanation: CPython creates the integer objects from -5 to 256 once and keeps them for the life of the interpreter, because these values are used so frequently. Whenever a number in that range is needed, CPython hands back the existing object instead of allocating a new one. Numbers outside the range, such as 257, get a fresh object, which is why c and d above have different ids while e and f share one.
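The cache can be checked outside the prompt as well. The sketch below is an illustration, not part of the talk. It builds the integers at runtime with int("...") so that the compiler has no chance to merge equal literals. The variable names are made up, and the behaviour is a CPython implementation detail that other interpreters or versions may not share.

```python
import sys

# Build the integers at runtime so the compiler cannot merge equal literals.
small_a = int("256")
small_b = int("256")
large_a = int("257")
large_b = int("257")

small_shared = small_a is small_b   # inside the cached range on CPython
large_shared = large_a is large_b   # outside the range described above

print(sys.implementation.name, small_shared, large_shared)
```

On the CPython versions this post describes, the first comparison is True and the second is False. Since the range is an implementation detail, code should compare integers with == and never with is.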
Let me blow your mind even more with the following code:
In [13]: id(257)
Out[13]: 139944437173872
In [14]: print(id(257), id(257), id(257) == id(257))
139944437171184 139944437171184 True
Woah, what kind of sorcery is this? You must be like, "Hey, you just told me that numbers after 256 get a different address every time, so what happened here?"
Explanation: The interactive prompt compiles each input separately. While compiling one input, Python records the constants it has seen in a dictionary and looks every new constant up there. If the constant is already present, the existing entry is reused, and with it the same object. All the 257s in In [14] belong to one statement, so they are one object. This is not the case in In [5] and In [6], because they are two separate statements compiled on their own.
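The effect can be reproduced without the prompt by calling the built-in compile directly. This is a hedged sketch. The "<demo>" filename and the variable names are placeholders, and the two separate compile calls only approximate what the prompt does with two inputs.

```python
# One compile unit: both literals end up in the same code object.
together = compile("c = 257\nd = 257", "<demo>", "exec")
shared_count = together.co_consts.count(257)   # how many 257 entries exist

# Two compile units, similar to two separate prompt inputs.
first = compile("c = 257", "<demo>", "exec")
second = compile("d = 257", "<demo>", "exec")

ns = {}
exec(first, ns)
exec(second, ns)

print(together.co_consts)    # constants of the single unit
print(shared_count)          # 1 on CPython: the literal is stored once
print(ns["c"] is ns["d"])    # identity across the two separate units
```

The single unit stores 257 once, so both assignments load the same object. Each separately compiled unit carries its own constants, so on a standard CPython build the last line usually prints False.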
Let's take it up a notch and try exploring more:
In [15]: def func():
    ...:     a = 257
    ...:     b = 257
    ...:     return id(a) == id(b)
    ...:
In [16]: func()
Out[16]: True
We found that for numbers above 256, Python creates new objects unless they are used in the same line/statement. However, in the above example the assignments are separate statements, and the comparison of memory locations still returns True. So why the odd behaviour?
Explanation: Within the same compilation scope, such as a function or a class body, every constant is stored in a constants dictionary (consts in the actual code). Each time the compiler encounters a constant, it first checks that dictionary. If the constant is already present, the existing entry is reused, and if not, a new entry is added. This is Python's idea of optimization at compile time. I had this doubt for a long time, and it was cleared up by Prasanth Raghu, whom I met at a later meetup and who had been working with the CPython code for quite some time. Have a look at the following links to get a better understanding of the flow.
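Before stepping into the C source, the shared constant can be seen from Python itself through the function's code object. This is a small sketch that assumes CPython's __code__.co_consts attribute. It compares with is directly rather than through id.

```python
def func():
    a = 257
    b = 257
    return a is b   # both names were loaded from the same constant slot

consts = func.__code__.co_consts   # constants compiled into the function
occurrences = consts.count(257)    # the literal appears twice in the source

print(consts)
print(occurrences)   # 1: stored only once
print(func())        # True
```

Both assignments load the one stored 257, which is why the function returns True even though the two literals sit in separate statements.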
- The program starts in the compiler_function: link
- Reaches VISIT_SEQ_IN_SCOPE function call: link
- Inside the if condition of VISIT_SEQ_IN_SCOPE calls compiler_visit_stmt: link
- Evaluates to Assign_kind: link
- Reaches VISIT function call: link
- Function definition of VISIT: link
- Goes to compiler_visit_expr: link
- The statement is a Num_kind so it calls ADDOP_O: link. consts is used here which comes from the following links:
- This internally calls compiler_addop_o: link
- Calls the compiler_add_o function: link
- Fetch from dictionary and return if exists: link
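The last two steps of the flow above boil down to "look the constant up, reuse its slot if it exists, otherwise add it". Here is a simplified Python model of that idea. The real code is C and differs in detail. The function name add_const and the (value, type) key are assumptions made for this sketch, not CPython's actual API.

```python
def add_const(consts, value):
    """Return the slot index for value, adding a new slot only if needed."""
    key = (value, type(value))   # the type keeps 1 and 1.0 apart
    if key in consts:
        return consts[key]       # seen before: reuse the existing slot
    index = len(consts)          # new constant: take the next free slot
    consts[key] = index
    return index

consts = {}
indices = [add_const(consts, v) for v in (257, 257, 1, 1.0, 257)]
print(indices)   # [0, 0, 1, 2, 0]
```

Every occurrence of 257 maps to slot 0. That mirrors how a repeated constant within one scope ends up as a single object.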
Hope the flow is clear, and do let me know if something was confusing.
References
- Video of the talk by Rivas on YouTube
- Python Docs relating to integer objects in memory.
- Understanding ## in C