Skip to content

Latest commit

 

History

History
executable file
·
546 lines (378 loc) · 15.4 KB

File metadata and controls

executable file
·
546 lines (378 loc) · 15.4 KB

Recursion

/****************************************************************************************
                   Recursion and Call Stack

    1.  Observe the stack frames when calling a recursive function

    2.  Different versions/instances of the local variable 'n' in Factorial(n)

    3.  The Memory Layout of a Linux Process (userspace)

                                                                    COMP9024

 ******************************************************************************************/

Non-recursive factorial function

#include <stdio.h>
#include <assert.h>

// !   Exclamation mark
// F(n) = n! = 1 * 2 * 3 * ... * (n-1) * n
long F(long n) {
    long result = 1;
    // assert(n >= 0);
    for (long i = 1; i <= n; i++) {
        result = result * i;
        // result *= i;
    }
    return result;
}

int main(void) {
    long n = 4;
    long x = F(n);
    printf("F(%ld) = %ld\n", n, x);
    return 0;
}

Function Call and Return

 High Address
              |             | 
              |_____________|
              |             | 
              | return addr |
              |             |
          n   |     4       |  
          x   |             |  main()'s stack frame  
              |             |
              |_____________| 
              |             |
          n   |     4       |  
              | return addr |  F()'s stack frame
              |             |
      result  |             | 
         i    |             |
              |             |  
              |             | 
              |             |
              |_____________| 
              |             |
                Call Stack
Low Address

Recursive functions

Recursive functions are functions that call themselves directly or indirectly within their definitions.

They are a powerful tool in programming, particularly for tasks that exhibit repetitive or self-similar structures.

Factorial(n) = n! = 1 * 2 * 3 * ... * (n-1) * n

Base case:     
    
    Factorial(n) = 1,    when n is 1

Recursive definition:

    Factorial(n) = 1 * 2 * 3 * ... * (n-1) * n
                 = Factorial(n-1) * n
Motivation:

    Break a large problem down into smaller, similar problems.
long Factorial(long n) {
    if (n <= 1) {
        // the base case for the recursive function
        return 1;
    } else {
        // return Factorial(n-1) * n;
        long result = Factorial(n - 1);
        result = result * n;
        return result;
    }
}

However, when using recursion, it's important to understand how it interacts with the call stack.

Call Stack (ignoring return addresses) Program

Call Stack

Note that a call stack is a stack data structure (Last-In-First-Out) that stores information about the active functions of a running program,

which is supported by the operating system and CPU (or by an interpreter).

Interestingly, recursive functions depend on the call stack.

The call stack also operates on a Last-In-First-Out principle.

This implies that the function most recently invoked will be the first one to be removed from the call stack.

Stack Frame

Each time a function is called, a new stack frame (containing the return address, local variables, parameters, and temporary variables)

is created and pushed onto the call stack.

The Memory Layout of a Linux Process (userspace)

A process can be thought of as an instance of a running program.

When you start a program on your computer, the operating system creates a process to manage that program's execution.

This process contains various resources, including memory, CPU time, and other system resources,

which are allocated to the program as it runs.

Due to dynamic linking, the actual layout might be more complex than the following figure.

     |_____________________|
     |  command line args  |
     |        and          |
     |environment variables|
     |_____________________| 
     |                     | Call Stack
     |                     |  
     |                     |  
     |.....................| 
     |                     |
     |                     |  
     |                     |   
     |                     |
     |                     |  Heap memory area (malloc() and free())
     |                     |
     |                     |  
     |                     |   
     |_____________________| 
     |                     |
     |                     |  Global memory area
     |                     |  
     |_____________________|  
     |                     |
     |                     |  Code Area
     |_____________________| 
     |                     |

    The Memory Layout of a Linux Process (userspace)

An executing program often partitions memory into:

Code area
    contains the machine code instructions for the program

    In C:
    after compiling, the machine code of functions will be put in the code area.
    
    // visible in all *.c files
    int test1(void) {
        return 2024;
    }

    // visible in the current *.c file containing this function
    static int test2(void) {
        return 2025;
    }

Data area

    Global memory area
        contains global/static variables and constant strings.        
        Its size is often determined by the compiler and linker when building the program, not at run time.

        global/static variables in C:

        Their lifetime extends throughout the entire execution of the program.

        // global variable, visible in all *.c files
        long year = 2024;

        // The default value of a global/static variable is 0
        // The static variable 'number' is visible in the current C file
        static int number;

        // "CSE@UNSW" is a constant string.
        // cptr is a pointer which points to the first character of "CSE@UNSW"
        char *cptr ="CSE@UNSW";

        void f(void){
            // 'count' is only visible in f()
            static int count = 0;            
            // How many times f() has been called
            count++;
        }


    Heap memory area 
        dynamically allocated via malloc()/free(), 
        C programers are responsible for calling malloc() and free().

        Python has automatic garbage collection for managing its heap memory.
        The main garbage collection mechanism in CPython is through reference counts.

        Variables allocated on the heap have a dynamic lifetime, controlled by the programmer.
        A heap-allocated variable itself doesn't have any name,
        so it should be accessed via a pointer variable.

        In C:

        void test(void) {
            long *pLong = (long *) malloc(sizeof(long));
            *pLong = 2024;
            // ...
            free(pLong);
        }

        Memory Layout:
                       --------
        pLong ---->      2024 
                       --------
                     heap-allocated variable (unnamed)

    Call stack 
        dynamically-allocated due to function calls/returns.
        The push operation (creating a new frame) is caused by a function call (generated by the compiler),
        while its pop operation (removing a frame) is caused by a function return (generated by the compiler).
        It consists of stack frames (First-In-Last-Out), one for each called function,
        Each frame might contain the return address, local variables, parameters, and temporary variables.

        Since different functions can have different sets of local variables, their stack frames vary in size.

        Local variables are declared within a function and are only accessible within that function's scope.
        Their lifetime is limited to the duration of the function call in which they are declared.

        Parameters and local variables in C:

        static void g(int n){
            int x;
            int *ptr = &n;
            // ...
        }

Linux Programmer's Manual

$ man malloc

#include <stdlib.h>

// "/usr/lib/gcc/x86_64-linux-gnu/11/include/stddef.h"
// typedef long unsigned int size_t;

void *malloc(size_t size);
void free(void *ptr);

DESCRIPTION
       The malloc() function allocates size bytes and returns a pointer to the
       allocated  memory.   The memory is not initialized.  If size is 0, then
       malloc() returns either NULL, or a unique pointer value that can  later
       be successfully passed to free().

       The  free()  function  frees  the memory space pointed to by ptr, which
       must have been returned by a previous call to  malloc(),  calloc(),  or
       realloc().   Otherwise, or if free(ptr) has already been called before,
       undefined behavior occurs.  If ptr is NULL, no operation is performed.

1 How to download this project in CSE VLAB

Open a terminal (Applications -> Terminal Emulator)

$ git clone https://github.com/sheisc/COMP9024.git

$ cd COMP9024/Stacks/Recursion

Recursion$ 

2 How to start Visual Studio Code to browse/edit/debug a project.

Recursion$ code

Two configuration files (Recursion/.vscode/launch.json and Recursion/.vscode/tasks.json) have been preset.

2.1 Open the project in VS Code

In the window of Visual Studio Code, please click "File" and "Open Folder",

select the folder "COMP9024/Stacks/Recursion", then click the "Open" button.

2.2 Build the project in VS Code

click Terminal -> Run Build Task

2.3 Debug the project in VS Code

Open src/main.c, and click to add a breakpoint (say, line 32).

Then, click Run -> Start Debugging

2.4 Directory

├── Makefile             defining set of tasks to be executed (the input file of the 'make' command)
|
├── README.md            introduction to this tutorial
|
├── src                  containing *.c and *.h
|   |
│   |── main.c           containing a recursive function, long Factorial(long n)
|
|
└── .vscode              containing configuration files for Visual Studio Code
    |
    ├── launch.json      specifying which program to debug and with which debugger,
    |                    used when you click "Run -> Start Debugging"
    |
    └── tasks.json       specifying which task to run (e.g., 'make' or 'make clean')
                         used when you click "Terminal -> Run Build Task" or "Terminal -> Run Task"

Makefile is discussed in COMP9024/C/HowToMake.

3 Build and run the project in command line

In addition to utilizing VS Code, we can also compile and execute programs directly from the command line interface as follows.

The address (e.g., 0x7fff73cf8a28) might differ due to Address Space Layout Randomization (ASLR).

Recursion$ make

Recursion$ ./main

main():      n = 4, &n = 0x7fff73cf8a28
Factorial(): n = 4, &n = 0x7fff73cf89f8
Factorial(): n = 3, &n = 0x7fff73cf89c8
Factorial(): n = 2, &n = 0x7fff73cf8998
Factorial(): n = 1, &n = 0x7fff73cf8968

Factorial(4) = 24

4 A similar test case in our large assignment

The following test case is not written in C, but with a simple language defined in our large assignment.

// COMP9024/LargeAssignment/tests/Factorial.scc

long result;

// f(n) = 1 * 2 * ... * (n-1) * n
f(n) {
  if (SccLessEqual(n, 1)) { // n <= 1
    return 1;
  }
  else {
    return f(n-1) * n;
  }
}


main(argc, argv, env) {
  long i;
  
  i = 4;
  result = f(i);
  output(result);

  return 0;
}

How to build and run the test case

LargeAssignment$ make

LargeAssignment$ make compileOne SCC_SRC_FILE=tests/Factorial.scc

./scc   < tests/Factorial.scc   > tests/Factorial.scc.s

    x86_64 assembly code generated in tests/Factorial.scc.s


LargeAssignment$ make runOne SCC_SRC_FILE=tests/Factorial.scc

24

Factorial.scc.s is assembly file generated by our large assignment, SCC, a simple C-like compiler.

You can take a look at Factorial.scc.s.

The assembly code is clearly commented line by line.

If you want to know "The Stack Layout in SCC When A Function Is Called", please refer to LargeAssignment/src/func.c.

The comments in EmitAstExprNode() in LargeAssignment/src/expr.c, and EmitPrologue() and EmitEpilogue() in LargeAssignment/src/emit.c

could help you understand the assembly code in Factorial.scc.s.

You can also refer to Guide to x86-64.

We will discuss the details in the coming weeks.