way to find size of arrays during runtime


maccorin
05-31-2004, 04:04 PM
i'm working on a string tracing library (LD_PRELOAD) that puts wrappers around strcpy, strncpy, strlen, strstr, and friends....

I'm writing it as I try to learn more about security and searching for bugs so it would be nice to have it print out the allocated amount of memory for the array that is being passed in.

AFAIK there is no standard way to do this, but i was able to come up w/ this hack for anything on the stack

...

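stack_var_size:
mov %ebp,%eax            # start at the current value of %ebp

__do_comp0:
cmp %eax, 4(%esp)        # compare the pointer argument with this frame pointer
jl __found0              # pointer is below it: stop at this frame
mov (%eax), %eax         # otherwise follow the saved %ebp to the caller's frame
jmp __do_comp0

__found0:
cmp $0x0,%eax            # a zero frame pointer means the chain ended
jz __return_0
subl 4(%esp), %eax       # distance from the pointer up to the frame pointer
ret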

...


that doesn't give the actual size usually, just the distance between the pointer and the value of %ebp for the function it was declared in, but that's close enough since that's what i'm really looking for anyways ;p

My problem is w/ variables on the heap, if the pointer is one that was returned by malloc


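unsigned long tmp = (unsigned long)array;   /* address malloc returned */
tmp -= 4;                                    /* step back to the size field */
tmp = *(unsigned long *)tmp;                 /* read the size field (size plus flag bit) */
tmp = (tmp & 0xfffffffe) - 4;                /* clear the flag bit, drop the 4-byte header */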


works, (i know it's not portable, i don't care this is only running on my box), but i can't think of any way to find the beginning of a chunk that was returned by malloc if the pointer i have is somewhere in the middle of it.

Does anyone know a way? It would help me out a _lot_.
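
For the case above where the pointer is exactly the one malloc returned, glibc also exposes the same number through malloc_usable_size() in <malloc.h>, so the header doesn't have to be read by hand. A minimal sketch, assuming glibc; heap_block_size is a made-up helper name, and this still does nothing for a pointer into the middle of a chunk.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size(): glibc extension, not standard C */
#include <stdlib.h>

/* Hypothetical helper: usable bytes in the block that starts at p.
   p must be exactly what malloc/calloc/realloc returned (or NULL);
   passing an interior pointer is not supported. */
size_t heap_block_size(void *p)
{
    return malloc_usable_size(p);
}
```

The value can be larger than what was requested, because the allocator rounds chunk sizes up.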

Strogian
05-31-2004, 04:19 PM
ok, i have no idea what ANY of that code is supposed to do. :D I looked at the asm, it's missing functions so I can't really read that, and the C code doesn't make any sense either. :D unless you're dealing with some sort of data structure that i don't know about. I'm about to draw a picture of this stuff...

But to answer your question, you want to find the beginning of a string, when you only have a pointer to the middle of it? Not possible in C. Might be some system function to do it -- the kernel has to be aware of its memory allocation, somehow... I'm sure you could slap something together to do what you want, but i don't see a need for it. Just get a pointer to the beginning of the string, like you're supposed to! :D

maccorin
05-31-2004, 04:28 PM
Originally posted by Strogian
ok, i have no idea what ANY of that code is supposed to do. :D I looked at the asm, it's missing functions so I can't really read that, and the C code doesn't make any sense either. :D unless you're dealing with some sort of data structure that i don't know about. I'm about to draw a picture of this stuff...

But to answer your question, you want to find the beginning of a string, when you only have a pointer to the middle of it? Not possible in C. Might be some system function to do it -- the kernel has to be aware of its memory allocation, somehow... I'm sure you could slap something together to do what you want, but i don't see a need for it. Just get a pointer to the beginning of the string, like you're supposed to! :D

the asm follows the chain of saved frame pointers up the stack until it finds one that is greater than the address of the variable i'm looking for, and then calculates the difference. the frame pointer is set up at the beginning of every function, and (%ebp) will always point to the previous function's %ebp, because every function starts with


push %ebp
movl %esp, %ebp


%esp is the stack pointer. this won't work if something is compiled w/ -fomit-frame-pointer, for obvious reasons ;p
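
The same walk can be sketched in C on a simulated stack, which avoids depending on real frame pointers. Everything here is an assumption for illustration: the "stack" is an array, a frame pointer is an index into it, each frame-pointer slot holds the caller's frame pointer, and 0 ends the chain. sim_stack_var_size is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Simulated stack: stack[fp] holds the saved frame pointer of the caller
   (a higher index), and 0 marks the end of the chain. */
size_t sim_stack_var_size(const size_t *stack, size_t fp, size_t var)
{
    /* Follow saved frame pointers until one lies above var. */
    while (fp != 0 && fp <= var)
        fp = stack[fp];

    /* Ran off the end of the chain: var is not inside any frame. */
    if (fp == 0)
        return 0;

    /* Distance from var up to the frame pointer of its function. */
    return fp - var;
}
```

With frames at indices 3, 8 and 13, a variable at index 5 belongs to the frame based at 8, so the result is 3.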

as far as what i have w/ the heap, glibc uses a version of doug lea's malloc (http://gee.cs.oswego.edu/dl/html/malloc.html), so it stores chunks like this

[code]
unsigned long size + 1
data
[/code]

when malloc returns after allocating a chunk it returns a pointer to data

so the c-code assumes you're using dlmalloc (if you're running linux, and a "normal" version of glibc, you are... well a modified version of it)

EDIT:

btw, i'm not really looking for the beginning of a string, i'm looking for the beginning of a chunk returned by malloc (or calloc, or whatever). And I don't have control over what pointer is passed to my function (it's just a library that wraps the various libc string functions)

Strogian
05-31-2004, 04:49 PM
have you considered using a debugger? ;)

maccorin
05-31-2004, 05:47 PM
Originally posted by Strogian
have you considered using a debugger? ;)

Of course, I use gdb all the time (i love that program). And I use objdump for my disassembly needs.

But neither of those programs do what i want to do.

Strogian
05-31-2004, 06:07 PM
i guess i still don't quite know what you're trying to do then.

if you need to know about all the malloc()'ed memory, why not write the wrapper around malloc()/free(), and just keep tabs on everything?
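
A rough sketch of what "keeping tabs" could look like, assuming a fixed-size table and made-up names (tracked_malloc, tracked_free, bytes_remaining). In an LD_PRELOAD library the wrappers would have to be named malloc/free and reach the real ones some other way, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLOCKS 1024   /* arbitrary cap for the sketch */

struct block { uintptr_t start; size_t size; };
static struct block blocks[MAX_BLOCKS];   /* start == 0 means the slot is free */

/* Hypothetical wrapper: allocate and remember where the block lives. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (blocks[i].start == 0) {
            blocks[i].start = (uintptr_t)p;
            blocks[i].size = size;
            break;
        }
    }
    return p;   /* if the table was full the block is simply untracked */
}

/* Hypothetical wrapper: forget the block, then release it. */
void tracked_free(void *p)
{
    for (size_t i = 0; p != NULL && i < MAX_BLOCKS; i++) {
        if (blocks[i].start == (uintptr_t)p) {
            blocks[i].start = 0;
            break;
        }
    }
    free(p);
}

/* Bytes from p to the end of the tracked block containing it.
   p may point into the middle of a block; returns 0 if it is unknown. */
size_t bytes_remaining(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    for (size_t i = 0; i < MAX_BLOCKS; i++) {
        if (blocks[i].start != 0 &&
            a >= blocks[i].start && a - blocks[i].start < blocks[i].size)
            return blocks[i].size - (size_t)(a - blocks[i].start);
    }
    return 0;
}
```

The linear scan keeps the sketch short; a sorted structure would be the obvious next step if lookups from inside strcpy and friends turned out to be slow.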

maccorin
05-31-2004, 06:23 PM
Originally posted by Strogian
i guess i still don't quite know what you're trying to do then.

if you need to know about all the malloc()'ed memory, why not write the wrapper around malloc()/free(), and just keep tabs on everything?

i actually thought of that, and tested it out. it's certainly possible, but it gets a bit messy, although as i write this i have an idea....

i think all of the *alloc functions eventually call __malloc_int or something like that, i'm gonna check that out in gdb right now...

maccorin
05-31-2004, 10:13 PM
as a follow up, i found something that (seemingly) works

once again, not portable, i think that would be asking too much ;)


void *heap_p;

....

void __attribute__((constructor)) constructor ( )
{
heap_p = sbrk(0);

....

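/* Size field stored at address i, with the low flag bit masked off. */
#define SIZE(i) (*(unsigned long *)(i) & 0xfffffffe)

size_t heap_var_size ( void *var )
{
    /* heap_p is the start of the heap; the first size field sits 4 bytes in. */
    char *i = (char *)heap_p + 4;

    /* Step from chunk to chunk until the next chunk starts at or past var. */
    while(i + SIZE(i) < (char *)var) {
        i += SIZE(i);
    }

    /* Bytes between var and the end of the chunk that contains it. */
    size_t r = i + SIZE(i) - (char *)var;
    return r;
}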


if anyone knows a better way, i'd be interested, wrapping malloc and friends or their _int_*alloc functions seems to be quite a hassle, so I'm avoiding doing that
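
The walk above can be tried out safely on a fake heap instead of the real one. This is only a sketch under simplified assumptions: each chunk is a 4-byte size field followed by data, the size covers the whole chunk, and bit 0 is a flag. chunk_size and sim_heap_var_size are made-up names, and offsets into a buffer stand in for addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the size field stored at byte offset off, masking off the low flag bit. */
static uint32_t chunk_size(const unsigned char *heap, size_t off)
{
    uint32_t raw;
    memcpy(&raw, heap + off, sizeof raw);
    return raw & ~UINT32_C(1);
}

/* Bytes from offset var to the end of the simulated chunk that contains it.
   Returns 0 if var is not inside any chunk of the heap_len-byte buffer. */
size_t sim_heap_var_size(const unsigned char *heap, size_t heap_len, size_t var)
{
    size_t off = 0;
    while (off + sizeof(uint32_t) <= heap_len) {
        uint32_t size = chunk_size(heap, off);
        if (size < sizeof(uint32_t) || size > heap_len - off)
            return 0;                 /* bogus size: stop instead of looping forever */
        if (var < off + size)
            return off + size - var;  /* var is inside this chunk */
        off += size;                  /* hop to the next chunk's size field */
    }
    return 0;
}
```

Unlike the real thing, the sketch bails out on a zero or oversized size field, which is the case that would otherwise make the loop spin or run off the heap.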

sploo22
05-31-2004, 10:58 PM
Dude, to tell you the truth, I think you've gone way beyond the skill level of most of us around here. (Well, me anyway.) :D

maccorin
05-31-2004, 11:01 PM
Originally posted by sploo22
Dude, to tell you the truth, I think you've gone way beyond the skill level of most of us around here. (Well, me anyway.) :D

:(

not really, i just spent like 4 hours reading the entire malloc.c over and over until i understood it... ;p

jim mcnamara
06-01-2004, 02:28 PM
www.boost.org has a smart pointers class. These work only in C++, but you can read about how they are implemented. Smart pointers know how long an array is and the starting address of the array.

FWIW.

maccorin
06-01-2004, 03:10 PM
Originally posted by jim mcnamara
www.boost.org has a smart pointers class. These work only in C++, but you can read about how they are implemented. Smart pointers know how long an array is and the starting address of the array.

FWIW.

smart pointers are usually implemented by overriding all the operator functions, collecting information, and passing it on to other smart pointers when copied, for instance. because i have no control over the code of the applications that i'm analyzing (i don't even have the source code sometimes :O), this solution wouldn't work for me.

But it is a _much_ better solution if you do have control.