The Curious Case of Matching Memory Locations in Python
I attended a Python meetup yesterday. It was quite informative, and one talk I found fascinating was
"Cleaning the trash in Python" by Rivas. He starts by explaining what
trash means in a programming sense, and goes on to compare memory allocation for objects and arrays in C and Python.
He mentions that when we create the value 100 and then use it again, Python points the new name at the existing 100 object instead of creating another one. He also explains how to find the memory location of an object using the id function, and shows that both names point to the same memory location. What exactly does the id function do?
id: Return identity of an object
In CPython it returns the memory address of the object, so it can be used to check whether two names point to the same object. Let's go right ahead and see a few quirks of Python using this.
    In : a = 100
    In : b = 100
    In : id(a)
    Out: 139944507028096
    In : id(b)
    Out: 139944507028096
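As a side note, here is a minimal sketch of what an id comparison actually tells us, using lists instead of integers. The names first, alias and duplicate are made up for illustration. For two objects that are alive at the same time, comparing their id values gives the same answer as the is operator.

```python
first = [1, 2, 3]
alias = first              # a second name for the very same list object
duplicate = list(first)    # a new list object with equal contents

# Same object: identical ids, and `is` agrees.
same_object = id(first) == id(alias)

# Equal contents do not imply the same object.
equal_but_distinct = (duplicate == first) and (duplicate is not first)

print(same_object, first is alias)
print(equal_but_distinct)
```

So id answers "is this the same object?", not "do these hold the same value?".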
All is well and good in Python land, which is being memory efficient by reusing the same object when we initialize something with 100. Let's proceed:
    In : c = 257
    In : d = 257
    In : id(c)
    Out: 139944437174032
    In : id(d)
    Out: 139944437170992
So what happened exactly? Let's try another example:
    In : e = 256
    In : f = 256
    In : id(e)
    Out: 139944507033088
    In : id(f)
    Out: 139944507033088
Uh...oh! Any guesses what happened here?
Explanation: CPython pre-allocates all integers from -5 to 256 when the interpreter starts, because they are the most frequently used, and it never creates new objects for them. Whenever you use a number in that range, Python just hands back a reference to the existing object. 257 falls outside that range, which is why c and d above ended up as two separate objects.
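A rough way to probe this cache yourself, assuming you are running CPython (the cache is an implementation detail, so other interpreters or future versions may behave differently). The helper name is_cached is made up for this sketch. It builds each integer twice at runtime from text, so the compiler has no chance to merge the two values as constants.

```python
def is_cached(n):
    # Build the integer twice at runtime. If both results are the same
    # object, the interpreter handed out a shared, pre-allocated integer.
    return int(str(n)) is int(str(n))

# Scan a window around the range mentioned in the explanation.
cached = [n for n in range(-20, 400) if is_cached(n)]

if cached:
    print("smallest shared integer:", cached[0])
    print("largest shared integer:", cached[-1])
else:
    print("no shared integers found on this interpreter")
```

On the CPython versions this post was written against, the scan should report the -5 to 256 range described above.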
Let me blow your mind even more with the following code:
    In : id(257)
    Out: 139944437173872
    In : print(id(257), id(257), id(257) == id(257))
    139944437171184 139944437171184 True
Woah, what kind of sorcery is this? You must be like, "Hey, you just told me that numbers after
256 get a different address every time, so what happened here?"
Explanation: When Python compiles a single statement, it collects the constants it encounters in a dictionary and looks every new constant up there. If the constant is already in the dictionary, the existing object is reused. This was not the case for c and d earlier, because they were assigned in two separate statements at two separate prompts.
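The same effect can be reproduced outside the interactive prompt with the built-in compile and exec. This is only a sketch: the file name labels like "<one-unit>" are arbitrary, and compiling two assignments together stands in for "constants seen in one compilation", while compiling them separately stands in for two separate prompts.

```python
# Two uses of 257 compiled together share one entry in the constants tuple.
one_unit = compile("x = 257\ny = 257", "<one-unit>", "exec")
print(one_unit.co_consts)

namespace = {}
exec(one_unit, namespace)
shared = namespace["x"] is namespace["y"]
print("compiled together, same object:", shared)

# Compiled separately, each code object gets its own constants tuple.
separate = {}
exec(compile("x = 257", "<first>", "exec"), separate)
exec(compile("y = 257", "<second>", "exec"), separate)
print("compiled separately, same object:", separate["x"] is separate["y"])
```

The first check should come out True. The second depends on the interpreter, and on the CPython builds discussed here it is expected to be False.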
Let's take it up a notch and try exploring more:
    In : def func():
    ...:     a = 257
    ...:     b = 257
    ...:     return id(a) == id(b)
    ...:
    In : func()
    Out: True
We found that for numbers above
256, Python stores them in new memory locations unless they are used in the same line/statement. However in the above example, they are separate statements and still return
True when we compare the memory locations. So why the odd behaviour?
Explanation: Within the same compilation scope, such as a function or a class body, every constant is recorded in a constants dictionary (consts in the actual code). Each time the compiler encounters a constant, it first checks that dictionary: if the constant is already present it reuses the existing object, and if not it stores a new one. This is Python's idea of optimization at compile time. I had this doubt for a long time; it was later cleared up by Prasanth Raghu, whom I met at one of the later meetups and who had been working with the CPython code for quite some time. Have a look at the following links to get a better understanding of the flow.
- The program starts in the VISIT_SEQ_IN_SCOPE function call: link
- Inside the
- Evaluates to VISIT function call: link
- Function definition of
- Goes to
- The statement is a Num_kind so it calls
consts is used here which comes from the following links:
- This internally calls
- Calls the
- Fetch from dictionary and return if exists: link
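The result of that compile-time lookup is visible from Python itself, without reading the C source. A small sketch, assuming CPython: every function carries its constants in func.__code__.co_consts, and 257 should appear there only once even though the body mentions it twice.

```python
def func():
    a = 257
    b = 257
    return a is b

consts = func.__code__.co_consts
print(consts)                 # the function's constants tuple
print(consts.count(257))      # how many entries equal 257
print(func())                 # both names load the same constant object
```

Both assignments load the same entry of the tuple, which is why the comparison inside func succeeds.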
Hope the flow is clear, and do let me know if something was confusing.
- Video of the talk by Rivas on YouTube
- Python Docs relating to integer objects in memory.