The Curious Case of Matching Memory Locations in Python
So I attended a Python meetup yesterday which was quite informative, and one of the talks I found fascinating was Cleaning the trash in Python by Rivas. He starts his talk by explaining what trash is in the programming sense, and goes on to compare memory allocation for objects and arrays in C and Python.
He mentions that when we create the value 100 and then use it again, Python points the new name at the existing 100 object instead of creating another one. He shows how to find the memory locations of objects using the id function and how both names point to the same memory location. What exactly does the id function do?
id: Return identity of an object
In CPython it returns the memory address of an object. This can be used to check whether two names point to the same object. So we will go right ahead and see a few quirks of Python using this.
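As a warm-up before the integer examples, here is a minimal sketch of what comparing id values tells you. The variable names are made up for illustration; the point is only that two names can share one object, while an equal-looking copy is a different object.

```python
first = [1, 2, 3]
alias = first              # a second name for the same list object
duplicate = list(first)    # a new list with equal contents

print(id(first) == id(alias))   # True: one object, two names
print(first is alias)           # True: `is` compares identity directly
print(first == duplicate)       # True: the values are equal
print(first is duplicate)       # False: two separate objects
```

Comparing with `is` is usually the more readable way to ask the same question as `id(x) == id(y)` while both objects are alive.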
In [1]: a = 100
In [2]: b = 100
In [3]: id(a)
Out[3]: 139944507028096
In [4]: id(b)
Out[4]: 139944507028096
All is well and good in Python land, being memory efficient and all when we initialize something with 100. Let's proceed:
In [5]: c = 257
In [6]: d = 257
In [7]: id(c)
Out[7]: 139944437174032
In [8]: id(d)
Out[8]: 139944437170992
So what happened exactly? Let's try another example:
In [9]: e = 256
In [10]: f = 256
In [11]: id(e)
Out[11]: 139944507033088
In [12]: id(f)
Out[12]: 139944507033088
Uh...oh! Any guesses what happened here?
Explanation: CPython creates all the integers from -5 to 256 ahead of time and keeps them in memory, because they are the most frequently used. It doesn't create new objects for them, so whenever you use a number in that range, Python just hands back a reference to the existing object. For 257, each statement typed at the prompt created a new object, which is why c and d had different addresses.
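If you want to probe this range yourself, here is a rough sketch. The helper name same_object is made up for this example. It builds each integer twice at runtime from a string, so the compiler has no chance to share a constant, and then checks whether both results are the same object. On the interpreter used in this post I would expect it to report -5 and 256, but the exact range is a CPython implementation detail and may differ between versions.

```python
def same_object(n):
    """Return True if building the int n twice at runtime yields one shared object."""
    # int(str(n)) creates the value at runtime, so no compile-time constant is reused.
    a = int(str(n))
    b = int(str(n))
    return a is b

# Scan a window around the range mentioned above and keep the shared values.
cached = [n for n in range(-20, 400) if same_object(n)]
print(cached[0], cached[-1])
```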
Let me blow your mind even more with the following code:
In [13]: id(257)
Out[13]: 139944437173872
In [14]: print(id(257), id(257), id(257) == id(257))
139944437171184 139944437171184 True
Woah, what kind of sorcery is this? You must be like, "Hey, you just told me that numbers after 256 get a different address every time, so what happened here?"
Explanation: The reason is that when Python compiles a single statement, it adds the constants it encounters to a dictionary and looks up every new constant in that dictionary. If the constant is already there, it reuses the existing object and hence the same memory location. This is not the case in In [5] and In [6] because they are two separate statements, each compiled on its own.
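One way to see this for yourself, as a sketch, is to compile a snippet by hand and look at the constants stored on the resulting code object. The snippet text and the "<demo>" filename are arbitrary choices for this example; compile, exec and co_consts are standard Python.

```python
# Compile two assignments as one unit, like one line typed at the prompt.
single = compile("x = 257; y = 257", "<demo>", "exec")
print(single.co_consts.count(257))   # 1: the constant is stored only once

namespace = {}
exec(single, namespace)
print(namespace["x"] is namespace["y"])   # True: both names load that one constant

# Compile the same assignments separately, mirroring In [5] and In [6].
separate = {}
exec(compile("x = 257", "<demo>", "exec"), separate)
exec(compile("y = 257", "<demo>", "exec"), separate)
print(separate["x"] is separate["y"])   # each unit carries its own constants
```

I have left the last result uncommented on purpose: it depends on the interpreter, though on the session shown above the two objects had different addresses.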
Let's take it up a notch and try exploring more:
In [15]: def func():
    ...:     a = 257
    ...:     b = 257
    ...:     return id(a) == id(b)
    ...:
    ...:
In [16]: func()
Out[16]: True
We found that for numbers above 256, Python stores them in new memory locations unless they are used in the same line/statement. However, in the above example they are in separate statements and the comparison of memory locations still returns True. So why the odd behaviour?
Explanation: The reason behind this is that within the same compiler scope, like a function or a class, every constant is stored in a constants dictionary (consts in the actual code). Any time a constant is encountered, it is first looked up in the constants dictionary; if it is present, no new object is stored, and if not, it is added. This is Python's idea of optimization at compile time. I had this question for a long time, and it was finally cleared up by Prasanth Raghu, whom I met at a later meetup and who had been working with the CPython code for quite some time. Have a look at the following links to get a better understanding of the flow.
- The program starts in the compiler_function: link
- Reaches VISIT_SEQ_IN_SCOPE function call: link
- Inside the if condition of VISIT_SEQ_IN_SCOPE calls compiler_visit_stmt: link
- Evaluates to Assign_kind: link
- Reaches VISIT function call: link
- Function definition of VISIT: link
- Goes to compiler_visit_expr: link
- The statement is a Num_kind so it calls ADDOP_O: link. consts is used here which comes from the following links:
- This internally calls compiler_addop_o: link
- Calls the compiler_add_o function: link
- Fetch from dictionary and return if exists: link
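The last step of that flow, "fetch from dictionary and return if exists", can be imitated in plain Python. This is only a toy model: add_constant is a hypothetical helper I made up, and keying on the type plus the value is my own simplification, not a copy of the C code. The second half checks the real thing through the function's code object.

```python
def add_constant(consts, value):
    """Toy model: return the slot index for value, adding it only if it is new."""
    # Keying on (type, value) keeps 1, 1.0 and True in separate slots.
    key = (type(value), value)
    if key not in consts:
        consts[key] = len(consts)
    return consts[key]

consts = {}
first = add_constant(consts, 257)
second = add_constant(consts, 257)   # already present, so the same slot comes back
other = add_constant(consts, 258)
print(first, second, other)          # 0 0 1

def func():
    a = 257
    b = 257
    return a is b

# The compiled function keeps a single 257 in its constants tuple.
print(func.__code__.co_consts.count(257))   # 1
print(func())                               # True: a and b load the same object
```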
Hope the flow is clear, and do let me know if something was confusing.
References
- Video of the talk by Rivas on YouTube
- Python Docs relating to integer objects in memory.
- Understanding ## in C