Solved Why are Linux files so darn large?

James888

The following file...

Code:
;***************************************************************************
; Assemble   : nasm -felf64 Example.asm
; Link       : ld -o Example Example.o -lc -ecrt0
;***************************************************************************
global crt0
extern printf
section .text

crt0:
    call printf
    mov rax,0xffff
    ret

... compiles into a gigantic 14 kB file! Why does GNU ld do that, and how can I prevent it? Also, when I open this file in Evan's Debugger (edb), the start point defaults to the system loader instead of crt0, as if ld64.a were included instead of linking to ld64.so. The system loader should be part of the image and not the file I'm debugging. Is there anything better optimized (and documented) than GNU ld?
 


 
If this was 30 years ago, 14 kB would've been large.
 
Do people here understand the difference between static and dynamic linking? That affects application file size.

When you write a program, you often link it to "libraries" of routines that your program uses to open files, communicate on the network, and so forth.

When the libraries are statically linked to your code, all the necessary components from the libraries are also included in the application file that you run.

When the libraries are dynamically linked, the application knows where to find the shared libraries and uses them (links to them) when you run the application. The same libraries are used and shared by many applications. The shared dynamic approach saves drive space, because each application does not require statically linked duplicates of the same common libraries.
 
They would if they knew the difference between a *.a and a *.so file. For Windows users, it's the same as knowing the difference between a *.lib and a *.dll file. :)
It's the same for Windows and I have a great many Windows programs that do exactly that without any issues like I'm having on Linux. But Linux is not Windows and I'm not familiar with Linux yet, so I have to rely on official documentation for whatever I do, and per the Linux manual page at https://man7.org/linux/man-pages/man1/ld.1.html:
-l namespec
--library=namespec
...

Specifically, on ELF and SunOS systems, ld will search a directory for a library called libnamespec.so before searching for one called libnamespec.a
That is the way all the NASM tutorials explained it as well, i.e. -lc will dynamically link libc.so to your file. If it doesn't do anything but statically link *.a or *.so files, it needs to be specific and say so. It would be awfully stupid of ld to statically link *.so files instead of dynamically linking them without documenting illogical behavior like that, so I hope that isn't what they are doing. I even used -l:libc.so.6 to force it to dynamically link libc.so and it made no difference in file size.
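The search order quoted from the man page can be sketched in a few lines of Python. This is only a model of the documented rule (try libnamespec.so before libnamespec.a in each directory, and treat a leading colon as an exact file name); `resolve_library` and the `libdemo` files are made-up names for illustration, and the sketch ignores everything else ld does, such as `-static` or linker scripts.

```python
import os
import tempfile


def resolve_library(namespec, search_dirs):
    """Return the first file the documented -l search rule would pick, or None."""
    for directory in search_dirs:
        if namespec.startswith(":"):
            # "-l:filename" form: look for exactly that file name.
            candidates = [namespec[1:]]
        else:
            # "-lname" form: the shared object is tried before the archive.
            candidates = [f"lib{namespec}.so", f"lib{namespec}.a"]
        for filename in candidates:
            path = os.path.join(directory, filename)
            if os.path.isfile(path):
                return path
    return None


# Demonstration in a throwaway directory holding both kinds of library file.
with tempfile.TemporaryDirectory() as tmp:
    for filename in ("libdemo.a", "libdemo.so"):
        open(os.path.join(tmp, filename), "w").close()
    shared_pick = os.path.basename(resolve_library("demo", [tmp]))
    exact_pick = os.path.basename(resolve_library(":libdemo.a", [tmp]))

print(shared_pick)  # libdemo.so
print(exact_pick)   # libdemo.a
```

So under this rule a plain `-lc` should end up with the shared object whenever one is present, which matches the `NEEDED libc.so.6` entry that readelf reports for the linked file.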

Furthermore, I never linked lib-2.31.so or lib-2.31.a to my executable, yet that is the first thing edb brings up during debugging. It's a mystery how that could happen because I can't find any documentation that says how that could happen.

PS -- Considering my last thread, I was wondering if I need to use --no-dynamic-linker? I don't have enough information from the Linux manual to know what that will do or how I would need to code my program so I could make that work. On Windows, you don't specify a loader or link to a loader, it's just a part of the OS and is automatic.
 
Edit: clarification as I renamed the files as sample[.asm, .o,...] instead of Example

I think this is because ELF 64 format has a lot of metadata.
I copied your example, and I think this is where part of those 14 kB are:
Code:
$ nasm -felf64 sample.asm
$ ld -o sample sample.o -lc -ecrt0
$ readelf -a sample
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x401020
  Start of program headers:          64 (bytes into file)
  Start of section headers:          12760 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         8
  Size of section headers:           64 (bytes)
  Number of section headers:         17
  Section header string table index: 16

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000400200  00000200
       000000000000000f  0000000000000000   A       0     0     1
  [ 2] .hash             HASH             0000000000400210  00000210
       0000000000000014  0000000000000004   A       4     0     8
  [ 3] .gnu.hash         GNU_HASH         0000000000400228  00000228
       000000000000001c  0000000000000000   A       4     0     8
  [ 4] .dynsym           DYNSYM           0000000000400248  00000248
       0000000000000030  0000000000000018   A       5     1     8
  [ 5] .dynstr           STRTAB           0000000000400278  00000278
       000000000000001e  0000000000000000   A       0     0     1
  [ 6] .gnu.version      VERSYM           0000000000400296  00000296
       0000000000000004  0000000000000002   A       4     0     2
  [ 7] .gnu.version_r    VERNEED          00000000004002a0  000002a0
       0000000000000020  0000000000000000   A       5     1     8
  [ 8] .rela.plt         RELA             00000000004002c0  000002c0
       0000000000000018  0000000000000018  AI       4    13     8
  [ 9] .plt              PROGBITS         0000000000401000  00001000
       0000000000000020  0000000000000010  AX       0     0     16
  [10] .text             PROGBITS         0000000000401020  00001020
       000000000000000b  0000000000000000  AX       0     0     16
  [11] .eh_frame         PROGBITS         0000000000402000  00002000
       0000000000000000  0000000000000000   A       0     0     8
  [12] .dynamic          DYNAMIC          0000000000402e98  00002e98
       0000000000000150  0000000000000010  WA       5     0     8
  [13] .got.plt          PROGBITS         0000000000402fe8  00002fe8
       0000000000000020  0000000000000008  WA       0     0     8
  [14] .symtab           SYMTAB           0000000000000000  00003008
       00000000000000f0  0000000000000018          15     5     8
  [15] .strtab           STRTAB           0000000000000000  000030f8
       000000000000005b  0000000000000000           0     0     1
  [16] .shstrtab         STRTAB           0000000000000000  00003153
       0000000000000085  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), l (large), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                 0x00000000000001c0 0x00000000000001c0  R      0x8
  INTERP         0x0000000000000200 0x0000000000400200 0x0000000000400200
                 0x000000000000000f 0x000000000000000f  R      0x1
      [Requesting program interpreter: /lib/ld64.so.1]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x00000000000002d8 0x00000000000002d8  R      0x1000
  LOAD           0x0000000000001000 0x0000000000401000 0x0000000000401000
                 0x000000000000002b 0x000000000000002b  R E    0x1000
  LOAD           0x0000000000002000 0x0000000000402000 0x0000000000402000
                 0x0000000000000000 0x0000000000000000  R      0x1000
  LOAD           0x0000000000002e98 0x0000000000402e98 0x0000000000402e98
                 0x0000000000000170 0x0000000000000170  RW     0x1000
  DYNAMIC        0x0000000000002e98 0x0000000000402e98 0x0000000000402e98
                 0x0000000000000150 0x0000000000000150  RW     0x8
  GNU_RELRO      0x0000000000002e98 0x0000000000402e98 0x0000000000402e98
                 0x0000000000000168 0x0000000000000168  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00  
   01     .interp
   02     .interp .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.plt
   03     .plt .text
   04     .eh_frame
   05     .dynamic .got.plt
   06     .dynamic
   07     .dynamic

Dynamic section at offset 0x2e98 contains 16 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000004 (HASH)               0x400210
 0x000000006ffffef5 (GNU_HASH)           0x400228
 0x0000000000000005 (STRTAB)             0x400278
 0x0000000000000006 (SYMTAB)             0x400248
 0x000000000000000a (STRSZ)              30 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x402fe8
 0x0000000000000002 (PLTRELSZ)           24 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x4002c0
 0x000000006ffffffe (VERNEED)            0x4002a0
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x400296
 0x0000000000000000 (NULL)               0x0

Relocation section '.rela.plt' at offset 0x2c0 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000403000  000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf@GLIBC_2.2.5 + 0
No processor specific unwind information to decode

Symbol table '.dynsym' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND [...]@GLIBC_2.2.5 (2)

Symbol table '.symtab' contains 10 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS sample.asm
     2: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
     3: 0000000000402e98     0 OBJECT  LOCAL  DEFAULT   12 _DYNAMIC
     4: 0000000000402fe8     0 OBJECT  LOCAL  DEFAULT   13 _GLOBAL_OFFSET_TABLE_
     5: 0000000000403008     0 NOTYPE  GLOBAL DEFAULT   13 _edata
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND printf@GLIBC_2.2.5
     7: 0000000000401020     0 NOTYPE  GLOBAL DEFAULT   10 crt0
     8: 0000000000403008     0 NOTYPE  GLOBAL DEFAULT   13 _end
     9: 0000000000403008     0 NOTYPE  GLOBAL DEFAULT   13 __bss_start

Histogram for bucket list length (total of 1 bucket):
 Length  Number     % of total  Coverage
      0  0          (  0.0%)
      1  1          (100.0%)    100.0%

Version symbols section '.gnu.version' contains 2 entries:
 Addr: 0x0000000000400296  Offset: 0x000296  Link: 4 (.dynsym)
  000:   0 (*local*)       2 (GLIBC_2.2.5)

Version needs section '.gnu.version_r' contains 1 entry:
 Addr: 0x00000000004002a0  Offset: 0x0002a0  Link: 5 (.dynstr)
  000000: Version: 1  File: libc.so.6  Cnt: 1
  0x0010:   Name: GLIBC_2.2.5  Flags: none  Version: 2

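This output points to some interesting facts. Besides the headers, the file carries hash tables (.hash, .gnu.hash), dynamic symbol and version tables, and relocation data, all of which are there to support dynamic linking against libc.so.6. Note also that the LOAD segments start at file offsets 0x1000 and 0x2000, so much of the file is alignment padding between segments rather than code or data.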
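To put rough numbers on it, here is a small Python tally using only the sizes printed in the readelf output above. It assumes the section header table is the last thing in the file (which is what the offsets in this listing suggest), so treat the result as an estimate rather than an exact accounting.

```python
# Section sizes in bytes, copied from the "Section Headers" listing above.
SECTION_SIZES = {
    ".interp": 0x0F, ".hash": 0x14, ".gnu.hash": 0x1C, ".dynsym": 0x30,
    ".dynstr": 0x1E, ".gnu.version": 0x04, ".gnu.version_r": 0x20,
    ".rela.plt": 0x18, ".plt": 0x20, ".text": 0x0B, ".eh_frame": 0x00,
    ".dynamic": 0x150, ".got.plt": 0x20, ".symtab": 0xF0, ".strtab": 0x5B,
    ".shstrtab": 0x85,
}

# Values from the "ELF Header" part of the same output.
ELF_HEADER_SIZE = 64
PROGRAM_HEADER_BYTES = 8 * 56    # 8 program headers of 56 bytes each
SECTION_HEADER_BYTES = 17 * 64   # 17 section headers of 64 bytes each
SECTION_HEADER_OFFSET = 12760

# Assumption: the section header table sits at the very end of the file.
file_size = SECTION_HEADER_OFFSET + SECTION_HEADER_BYTES

content_bytes = (
    sum(SECTION_SIZES.values())
    + ELF_HEADER_SIZE
    + PROGRAM_HEADER_BYTES
    + SECTION_HEADER_BYTES
)
padding_bytes = file_size - content_bytes

print(file_size)      # 13848
print(content_bytes)  # 2676
print(padding_bytes)  # 11172
```

By this count only about 2.6 kB of the file is headers, tables, and code; the remaining bytes are gaps between the pieces.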

I am not an assembler connoisseur as last time I coded assembler it was for a Motorola 68000, but maybe you have other formats to set as output if this really bothers you, like for example aiming to just your particular architecture and nothing else. That should definitely alleviate the extra size.
 
This got me to thinking so I tried a few things:

-s
Reduces file by 1kb. Maybe should not use when debugging?

-no-dynamic-linker
Has no effect on file size, but it allowed the program to come up in the debugger at location crt0. This confirms my suspicion that ld.so was part of the image and not part of the executable. Also, looking at the file I saw lots and lots of filler, e.g. nops and zeroes. So I thought "I've seen that before, that has to do with page size", so I tried...

-zmax-page-size=64

Now the file is a nice 2128 bytes in size.
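Why the page size matters so much can be shown with a toy model in Python. It assumes each loadable segment simply starts at the next multiple of the page size in the file; real ld layout has more rules than that, so the totals below will not reproduce the actual 14 kB or 2128 bytes, only the trend. `align_up` and `packed_size` are hypothetical helper names.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def packed_size(segment_sizes, page_size):
    """File bytes needed if every segment starts on a page_size boundary."""
    offset = 0
    for size in segment_sizes:
        offset = align_up(offset, page_size) + size
    return offset


# FileSiz of the three non-empty LOAD segments in the readelf output above.
LOAD_SIZES = [0x2D8, 0x2B, 0x170]

size_4096 = packed_size(LOAD_SIZES, 4096)
size_64 = packed_size(LOAD_SIZES, 64)

print(size_4096)  # 8560
print(size_64)    # 1200
```

The same 1139 bytes of segment content need several times more file space when every segment has to begin on a 4096-byte boundary.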

PS -- During my research I came across A WHIRLWIND TUTORIAL ON CREATING REALLY TEENSY ELF EXECUTABLES FOR LINUX which is completely useless for issues like this.

PPS -- For all the "asm connoisseurs" out there like me, I came across this interesting code snippet on GitHub ...

Code:
BITS 64

; ELF header
ehdr:
          db    0x7F, "ELF"         ; magic
          db    2                   ; 64-bit
          db    1                   ; little endian
          db    1                   ; ELF version, always 1
          db    0x03                ; target Linux
  times 8 db    0                   ; padding
          dw    0x02                ; executable
          dw    0x3E                ; x86-64 machine
          dd    0x01                ; ELF version, always 1
          dq    _start              ; execution entry (in memory)
          dq    phdr-ehdr           ; program header (in file)
          dq    0                   ; section header table
          dd    0                   ; flags
          dw    ehdrsize            ; size of this ELF header
          dw    phdrsize            ; size of the program header table entry
          dw    1                   ; number of program headers
          dw    0                   ; size of the section header table entry
          dw    0                   ; number of sections headers
          dw    0                   ; name section index: no names section

ehdrsize  equ   $ - ehdr

; Program header
phdr:
          dd    0x01                ; loadable segment
          dd    5                   ; read+execute segment
          dq    codestart           ; segment content offset (in file)
          dq    _start              ; virtual address in memory
          dq    _start              ; physical address in memory
          dq    codesegsize         ; size in the file
          dq    codesegsize         ; size in memory
          dq    0x1000              ; alignment

phdrsize  equ   $ - phdr
codestart equ   $ - ehdr

; Code segment
_start:
          org 0x400000
          ; the file start is at memory location 0x400000
          ; the code loads at 0x400078. this is necessary to ensure 0x1000
          ; alignment. (on my machine, lower alignments give segfaults)
          mov rax, 1
          mov rdi, 1
          mov rsi, msg
          mov rdx, msglen
          syscall
          mov rax, 60
          mov rdi, 0
          syscall

; data
msg:         db    "Hello, World",10
msglen       equ   $ - msg

codesegsize  equ   $ - _start

I don't really need it, and like teensyElf it needs more work since it doesn't show how to call external functions, but it has a lot of potential for making teensy ELFs for x86-64 using external calls with no linker needed.
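As a cross-check of the sizes in that listing, the same two headers can be packed with Python's struct module. This is a sketch that mirrors the db/dw/dd/dq lines field for field; the format strings and the names `ELF64_EHDR`, `ELF64_PHDR`, and `LOAD_BASE` are mine, not from the GitHub snippet.

```python
import struct

# Little-endian, unpadded layouts of the ELF64 file header and one
# program header, in the same field order as the listing above.
ELF64_EHDR = struct.Struct("<4sBBBB8xHHIQQQIHHHHHH")
ELF64_PHDR = struct.Struct("<IIQQQQQQ")

LOAD_BASE = 0x400000  # the "org" value used in the listing

# The code starts right after the two headers, as codestart does.
code_offset = ELF64_EHDR.size + ELF64_PHDR.size
entry = LOAD_BASE + code_offset

header = ELF64_EHDR.pack(
    b"\x7fELF",       # magic
    2, 1, 1, 3,       # 64-bit, little endian, ELF version, target Linux
    2, 0x3E, 1,       # executable, x86-64, ELF version
    entry,            # execution entry (in memory)
    ELF64_EHDR.size,  # program header offset (phdr - ehdr)
    0,                # no section header table
    0,                # flags
    ELF64_EHDR.size,  # ehdrsize
    ELF64_PHDR.size,  # phdrsize
    1,                # number of program headers
    0, 0, 0,          # no section headers, no names section
)

print(ELF64_EHDR.size, ELF64_PHDR.size)  # 64 56
print(hex(entry))                        # 0x400078
```

The 0x400078 result agrees with the comment in the listing that says the code loads at 0x400078.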
 
Hold the phone! I can't use -zmax-page-size like I thought I could. The debugger returns an error of "Failed to open and attach to process: First event after waitpid() should be STOP of type SIGTRAP, but wasn't, instead status=0xb7f". I wonder if there is an English version of GNU ld available? :D
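That status number can be decoded by hand. Assuming the value in the message is the hexadecimal wait status 0xb7f, the Python sketch below applies the same bit tests that the Linux WIFSTOPPED/WSTOPSIG family of macros use; `decode_wait_status` is a hypothetical helper name.

```python
def decode_wait_status(status):
    """Classify a Linux wait status the way the WIF* macros do."""
    if status & 0xFF == 0x7F:
        # Low byte 0x7f means "stopped"; the next byte is the stop signal.
        return ("stopped", (status >> 8) & 0xFF)
    if status & 0x7F == 0:
        # Low seven bits clear means a normal exit; next byte is the exit code.
        return ("exited", (status >> 8) & 0xFF)
    # Otherwise the low seven bits hold the terminating signal.
    return ("signaled", status & 0x7F)


kind, number = decode_wait_status(0xB7F)
print(kind, number)  # stopped 11
```

On x86-64 Linux signal 11 is SIGSEGV and SIGTRAP is 5, so one reading of the message is that the process hit a segmentation fault before the debugger ever saw its first trap.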
 
I am not a programmer. By that, I mean, I have - but you wouldn't want me to and surely wouldn't pay me to.

I have to ask...

Why are you programming in assembly? Why not some higher-level language like Python? If you need low-level programming, C or C++ are still options.

This is by no means a slight. I'm just curious and this is a hole in my knowledge.

As far as I know, not a whole lot of folks are competent with assembly. If my understanding is correct, and you're willing to work as a contractor and you're any good at it, you can pretty much demand the salary you want, not unlike some of the older languages whose code still needs to be maintained.
 
There is no hole in your knowledge. I program in assembly because many years ago I had many negative experiences with Microsoft Visual Studio and wanted something better, and assembly sounded ... nerdy cool! Assembly is not any harder than a higher level language (once you've been properly taught how to do it), in fact with a few macros, it looks and acts a lot like the C language. But you can do more things with it than you can in other languages, e.g. -- no issues with multiple type-casting cruft that doesn't really do anything to the data.
Hahaha! I've heard that myth too. No one wants an assembly language programmer, allegedly because not many people can read assembly. They want readable and maintainable code. It doesn't help when I respond to that allegation with "Have you ever tried to read regex?" or "Have you looked at your own code lately, because I have and it looks like uncommented spaghetti code?" or "How have all those many C# marshalling functions worked out for you when interfacing with embedded code?".
 
I expect Windows to be bloated, not Linux :)

Linux isn't...you're talking about bytes, which is nothing these days...1,000,000 bytes = 1 MB...your file is...
2,128 bytes = 0.002128 MB, which is nothing...a floppy disk was 1.44 MB, again nothing now.
 
14kb is bloat -- 2128b is not.
 
Ok, I see part of the problem now. When I used -zmax-page-size=4096, the problem goes away but the bloat returns. 4096 is a magic number. It is the actual size of a page in memory. Is the Linux kernel being lazy and instead of loading data smaller than 4096b and filling the rest of the memory page with zeroes (or whatever they want), it is expecting the program to provide an image in multiple sizes of 4096b? Is there another option that allows the program size to be smaller than the page size without having to write out all those zeroes all for nothing and get rid of the bloat?
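One constraint behind this, as far as I understand the ELF rules: a loadable segment is mapped from the file in whole pages, so its file offset and its virtual address have to agree modulo the page size. The Python check below applies that test to the LOAD segments from the readelf output earlier in the thread. It does not by itself explain every padding byte, and `is_mappable` is a hypothetical name.

```python
# (file offset, virtual address) of each LOAD segment in the readelf output.
LOAD_SEGMENTS = [
    (0x0000, 0x400000),
    (0x1000, 0x401000),
    (0x2000, 0x402000),
    (0x2E98, 0x402E98),
]
PAGE_SIZE = 4096


def is_mappable(offset, vaddr, page_size=PAGE_SIZE):
    """True if offset and address fall at the same position within a page."""
    return offset % page_size == vaddr % page_size


all_ok = all(is_mappable(offset, vaddr) for offset, vaddr in LOAD_SEGMENTS)
print(all_ok)  # True

# A made-up segment packed tightly in the file but page-aligned in memory
# would fail the same test.
print(is_mappable(0x0340, 0x401000))  # False
```

Note that the last segment passes even though it does not start on a page boundary; only the position within the page has to match.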
 
Ok, problem finally solved! In addition to all the above, all I needed to add was -zcommon-page-size=64. So now, instead of the bloat being in the executable, the bloat is in the command line :p. The command line is now a whopping 97b: ld -o Example Example.o -lc -ecrt0 -s -no-dynamic-linker -zmax-page-size=64 -zcommon-page-size=64, but it works so I'm not complaining (in public).
 
The assembler I wrote was nearly all firmware. I always assumed it was true for most code written in assembler.

I had heard rumors of people who wrote applications in assembler because that was their preferred method. It was almost mythical. I never met anyone who actually did that until now.
 

I hate to semi-disappoint you, but since asm is so tedious to write sometimes, I wrote a copyrighted program in asm, that writes GoAsm code that is then assembled, linked, and compiled. My program is very BASIC-like but the asm code is very C-like. For example, here is a snippet of my "hi-level code",

Code:
[m] cbSpn9Star       w64 args=hSpn,dPos
    asm mov eax,[dPos]
        mov [l9Star],eax
    iff dPos=0
        let cGatePlacement2=0 cOrientation2=0
    cll w64 imImageDestroy args=pImImage2
    cll w64 imImageDestroy args=pImImage3
    cll w64 fnCreateHouses
    rtn

and here is its output,

Code:
cbSpn9Star frame hSpn,dPos
    mov eax,[dPos]
    mov [l9Star],eax
    cmp d[dPos],0
    jne >>a0
    mov b[cGatePlacement2],0
    mov b[cOrientation2],0
a0:
z0:
    invoke imImageDestroy,[pImImage2]
    invoke imImageDestroy,[pImImage3]
    invoke fnCreateHouses
    ret
    endf

So my 11,184-line program source code for that program becomes a 13,068-line asm source code file, which is then assembled, linked, and compiled by GoAsm and GoLink into a 547.5 kB executable file. I had a lot of fun (and frustration) writing those programs (and many others).

I am currently rewriting all my code for everything I ever wrote for Windows so it will run in Linux, so these discussions I'm having are very appreciated and useful.

I'm still on the learning curve here (which can be very frustrating when there is no accurate, reliable help for doing that). For example, I had to learn that whereas Windows has exception handlers, Linux has signals. Windows and Linux both have pipes, but to capture the output of GoAsm (which is a command-line-only program) on Windows I had to use CreateProcess, drop down into DOS, and save the output to a temp file, whereas Linux has pipes, fork, and dup, which is a far more elegant way of doing things.
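For anyone following along, the pipe, fork, and dup pattern looks roughly like this in Python, whose os module wraps the same system calls. It is a minimal sketch for Linux only: `capture_output` is a made-up helper, and `echo` stands in for whatever command-line tool is being captured.

```python
import os


def capture_output(argv):
    """Run argv in a child process and return everything it writes to stdout."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout (fd 1) at the pipe, then become the program.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never fall back into parent code.
            os._exit(127)
    # Parent: close our copy of the write end so read() can see end-of-file.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


output = capture_output(["echo", "hello from the child"])
print(output)  # b'hello from the child\n'
```

No temp file is involved: the child's output goes straight through the pipe into the parent's buffer.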

So I wish to thank the Linux community here for helping me through these tough times :). I would also like to apologize for acting so Autistic, but I won't because I am Autistic (level 1.5 I think).
 