What is a circular pointer?
Full disclosure, I almost cannot think of any practical value to this post. It's provided also entirely as a "Huh, that's a weird thing you can do in C", rather than an actually useful technique. The only purpose I can see is to use them for a circularly linked list with one element. There may be another term that's in use, but I do not know what it is.
A circular pointer is a pointer that, well, points to itself! No matter how many times we dereference that pointer, we should still be at the same spot. For example,
Another way to think about it is ****p == p
. Let's try to think of
some ways to implement this!
Failed Attempt
The first thought that might come to our mind might be something like what's below. Because (spoilers) this code seg faults, I wrote a signal handler so we still see some kind of results.
Calling most library functions in signal handlers is a bad idea, so
I'm using the write()
system call. Headers are excluded for
compactness. I'm using fflush()
as printf()
buffers its output.
Otherwise, the first printf()
will not appear. Lastly, even though
an error occurs, I'm exiting with a zero exit code because of
=org-babel= limitations. You can find out more about this in [*Crash
and burn programming](this post).
void segfault(int sig_num) {
fflush(stdout);
write(STDOUT_FILENO, "Segfault!\n", 9);
_exit(0);
}
int main(void) {
signal(SIGSEGV, segfault);
int *p;
printf("p 0x%x\n", p);
,*p = p; /* segfault occurs here */
printf("****p %x", ****(int ****)p);
}
#+RESULTS:
: p 0x0
: Segfault!
If we walk through the code, we can figure out the problem. We
allocate space for an integer pointer p, which contains 0. We then
attempt to store the value in p, 0, in the value pointed to by p, 0.
In other words, we're attempting to dereference a NULL
pointer,
causing a seg fault.
Some background
Okay, so all we need to do is malloc()
space for the pointer p to
point to. Easy, right? Before we get to that though, there's a few
things we have to keep in mind.
Malloc with multiple levels of pointers
First, we need to be careful allocating space for variables when using multiple levels of pointers. For example,
int **p = malloc(sizeof (int));
,**p = 1;
will seg fault! p is a pointer to a pointer containing an int. When we
malloc memory, malloc will return a pointer to a block of memory large
enough to contain an int. If we type **p
, we're saying "Go to the
malloced block, then go to where the malloced block is pointing at
and set that block to 1". Because malloc()
'd memory can be anything,
we are accessing memory at random! This is an easy recipe for a seg
fault. One way to fix this is to use multiple variables, for
instance...
int *p1 = malloc(sizeof (int));
int **p2 = &p1;
,**p2 = 7;
printf("***p2 %d", **p2);
#+RESULTS:
: ***p2 7
Compile time errors dereferencing pointers
Next up, if we have int *p
, then p
is an int *
. If we try to do **p
, our
compiler will complain about an invalid type argument and fail to compile. You can't
dereference an int *
two times!
The compiler helpfully assumes that you would never dereference a pointer more than the number of pointers you have. In other words,
int *p = malloc(sizeof (int));
printf("*p1 %d\n", *p);
will compile, but
int *p = malloc(sizeof (int));
printf("**p1 %d\n", **p);
will not compile. The compiler sees that *p1 is an int pointer, so it
knows **p
is always invalid. If we pretend seg faults don't exist
for a second, we can fix this issue using casts! A cast tells the
compiler "Hey, I know that p
is an integer pointer, but I need you
to treat this as another type for right now".
int *p = malloc(sizeof (int));
printf("**p %l", **(int **)p);
In this case, we're saying "Hey, you know that integer pointer p? Let's treat it as pointer to a pointer to an integer for right now".
sizeof (int *)
vs sizeof (int)
The last thing to remember deals with sizes. In 64-bit systems, a
pointer is 64-bits long, regardless of the type it points to. sizeof int* == sizeof char* == sizeof double** == 8
. Using what we've
learned so far, let's say we have the following code block.
int *p = malloc(sizeof (int));
,*p = (int)p;
printf("**p %d", **(int **)p);
We've managed to get everything right! Except for one little part.
*p
is an integer, not an integer pointer. We store the integer
pointer p
in *p
. Because sizeof (int *) == 8
and sizeof (int) == 4
(on my machine), part of the pointer is chopped off! p
needs
to point to a type that is the same size (or larger) than an integer
pointer.[fn:2:Or any non-function pointer. There is no guarantee that
a function pointer is the same size as a integer/double/etc pointer,
and they're more of a abstraction that exists in our code than
something that's really "there", like integers in memory.]
The C standard actually includes an integer type that is guaranteed to
be the same size as a pointer, the signed intptr_t
or unsigned
uintptr_t
types defined in =stdint.h=. I'll stay away from these
since I'm worried it'll make it a bit more confusing.
Time to put all of this into the most pointless (HAHAHAHAHA) practice imaginable.
Circular pointers with malloc()
!
Let's take what we've learned and try to create a pointer that points to itself, and one where we can dereference it as many times as we want!
int main(void) {
long *p = malloc(sizeof (long *));
,*p = (long)p; /* cast isn't required here */
printf("p 0x%lx\n", p);
printf("*p 0x%lx\n", *p);
printf("**p 0x%lx\n", **(long **)p);
printf("***p 0x%lx\n", ***(long ***)p);
}
#+RESULTS:
| p | 0x558366a4c2a0 |
| *p | 0x558366a4c2a0 |
| **p | 0x558366a4c2a0 |
| ***p | 0x558366a4c2a0 |
Look at that! We've successfully created a pointer that points to itself! No matter how many times we dereference it, we are still looking at the same pointer.
For fun, let's create a loop so we can deference the pointer as many times as we want very easily.
int main() {
long *p = malloc(sizeof (long *));
,*p = (long)p;
for (int i = 0; i < 5; i++) {
long ptemp = *p;
printf("%d 0x%lx\n", i, *p);
p = (long *)ptemp;
}
}
| i | dereference |
|---+----------------|
| 0 | 0x55ffc15d32a0 |
| 1 | 0x55ffc15d32a0 |
| 2 | 0x55ffc15d32a0 |
| 3 | 0x55ffc15d32a0 |
| 4 | 0x55ffc15d32a0 |
And there we go. This is one option for how we can implement a pointer that points to itself, no matter how many times we dereference it. If we wanted to, we add more pointers, creating a circularly linked list that looks like
Circular pointers with structs!
Another option that saves us from all this casting is to use structs. Because a struct can contain anything[fn:3:Except for a struct containing itself directly.], even a pointer to a struct of the same type, we can do the following.
struct p_self {
struct p_self *p;
int magic;
};
int main() {
struct p_self p = { .p = &p, .magic = 4032 };
printf("magic %d", p.p->p->p->p->p->magic);
}
#+RESULTS:
: magic 4032
To my knowledge, we can't make p
a pointer to a struct very
compactly. If we create a compound literal like
struct p_self *p = &(struct p_self){ .p = /* problem */, .magic = 4032 };
we have a problem. The compound literal needs to contain a field that contains it's own address. I don't believe it is possible to do this, as there's no way to refer to the compound literal we are inside of. (This could be avoided by not using compound literals and introducing a second variable, but I'd rather not.)
Regardless, what we can conclude from this is that pointers are weird and confusing and there's many little ways to mess up, especially as we do increasingly weird stuff with them. But it's fun!