
User:Andux/Format research

From Dwarf Fortress Wiki
Revision as of 20:07, 24 May 2011 by Andux (talk | contribs) (→‎Compressed block: a word on CMV files)

Partly based on User:Rick/Save research.

Common structures

Compressed block

Type Name Description
uint32 Z_Size Size of compressed data
bytes Z_Data Z_Size bytes of zlib-compressed data.

After decompression, the data for each block typically occupies no more than 20000 bytes. CMV files are the exception: there, the data size is proportional to the width, height, and number of frames in the block. For example, 200 frames of 80x25 video will occupy 800000 bytes (200 x 80 x 25 cells, which works out to 2 bytes per cell).
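As a minimal sketch of the structure above, the following Python reads and writes a single compressed block. It assumes Z_Size is stored little-endian (not stated above, but typical for an x86 game); the function names are hypothetical.

```python
import io
import struct
import zlib


def read_compressed_block(stream):
    """Read one compressed block and return its decompressed bytes.

    Returns None if the stream is already at its end.
    """
    header = stream.read(4)
    if len(header) == 0:
        return None
    if len(header) < 4:
        raise ValueError("truncated Z_Size field")
    # Assumption: Z_Size is a little-endian uint32.
    (z_size,) = struct.unpack("<I", header)
    z_data = stream.read(z_size)
    if len(z_data) != z_size:
        raise ValueError("truncated Z_Data field")
    return zlib.decompress(z_data)


def write_compressed_block(payload):
    """Compress payload and prefix it with its compressed size."""
    z_data = zlib.compress(payload)
    return struct.pack("<I", len(z_data)) + z_data
```

A writer that wants to stay within the typical block size would split its output into chunks of at most 20000 bytes before calling write_compressed_block on each.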

String

Type Name Description
uint16 StrLen Length of string in bytes. Can be zero.
bytes StrVal StrLen characters of string data. Not null-terminated.

Note: In memory, strings are generally stored in fixed-size buffers, which are often much smaller than the maximum length one could specify in StrLen; when writing strings to a file, it is recommended to limit their size to less than 128 bytes.
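A rough Python sketch of reading and writing this String structure follows. Little-endian byte order is an assumption, and the 127-byte cap simply applies the recommendation in the note above; the helper names are hypothetical.

```python
import io
import struct

MAX_SAFE_LEN = 127  # "less than 128 bytes", per the note above


def read_exact(stream, n):
    """Read exactly n bytes or raise."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data


def read_string(stream):
    """Read a uint16 length followed by that many bytes (no terminator)."""
    # Assumption: StrLen is little-endian.
    (str_len,) = struct.unpack("<H", read_exact(stream, 2))
    return read_exact(stream, str_len)


def write_string(raw):
    """Serialize raw bytes as a String, refusing risky lengths."""
    if len(raw) > MAX_SAFE_LEN:
        raise ValueError("string longer than 127 bytes")
    return struct.pack("<H", len(raw)) + raw
```

The functions work on raw bytes rather than text, since the character encoding is not specified here.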

List

A List structure can contain nearly any type of data (including another List); for clarity, I will describe each one as a "List of X" (or "X-list") below, where X is the type of structure the list contains.

Type Name Description
uint32 EntryCount Number of entries in the list
X Entry EntryCount values/structures of type X.
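Because a List only fixes the count prefix, a reader can take the entry parser as a parameter. The sketch below shows one way to do this in Python; it assumes little-endian integers, and the helper names are hypothetical.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or raise."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data


def read_uint16(stream):
    # Assumption: little-endian byte order.
    return struct.unpack("<H", read_exact(stream, 2))[0]


def read_list(stream, read_entry):
    """Read a uint32 EntryCount, then that many entries via read_entry."""
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    return [read_entry(stream) for _ in range(count)]


def read_list_of_lists(stream, read_entry):
    """A List of X-lists: the entry parser is itself read_list."""
    return read_list(stream, lambda s: read_list(s, read_entry))
```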

Name struct

Type Name Description
String Name Specifies the first name of a creature.
String Nickname Specifies the nickname of a creature.
int32s WordIndex An array of 7 signed integers, each giving the index into the list/vector of WORD entries of the word used by that name-part; a value of -1 means the name-part is omitted.
uint16s WordForm An array of 7 unsigned int16s, each specifying the form of word to use for the corresponding name-part.
int32 Language Index into the list/vector of TRANSLATION entries; -1 means word data is ignored.
int16 Unknown May relate to how the game assigns titles?

The basic layout of a name is:

<Name>[ <Word[1]><Word[2]>][ the [<Word[3]> ][<Word[4]> ][<Word[5]>-]<Word[6]>][ of <Word[7]>]

Name-parts

Each of the 7 name-parts corresponds to one of the categories from the left panel of the adventurer/fort name selection screen:

  1. Front compound
  2. Rear compound
  3. Adj 1
  4. Adj 2
  5. Hyphen compound
  6. The X
  7. Of X
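The name layout and the seven name-parts can be combined into a small formatter, sketched below in Python. It takes words that have already been looked up and translated. Two points are assumptions not confirmed above: the compound group is emitted when either of its two words is present, and the "the" group is emitted only when Word[6] is present. Capitalization is not handled, and the example words in the usage note are invented.

```python
def format_name(first_name, words):
    """Assemble a full name from a first name and 7 optional words.

    words[0] corresponds to Word[1] (front compound), words[6] to
    Word[7] ("of X"); use None for an omitted name-part.
    """
    if len(words) != 7:
        raise ValueError("expected exactly 7 name-parts")
    w = [None] + list(words)  # shift to 1-based, matching the layout
    out = first_name
    # [ <Word[1]><Word[2]>]  (assumption: either word triggers the group)
    if w[1] or w[2]:
        out += " " + (w[1] or "") + (w[2] or "")
    # [ the [<Word[3]> ][<Word[4]> ][<Word[5]>-]<Word[6]>]
    if w[6]:
        out += " the "
        if w[3]:
            out += w[3] + " "
        if w[4]:
            out += w[4] + " "
        if w[5]:
            out += w[5] + "-"
        out += w[6]
    # [ of <Word[7]>]
    if w[7]:
        out += " of " + w[7]
    return out
```

For instance, format_name("Urist", ["Boat", "murdered", None, None, None, None, None]) gives "Urist Boatmurdered".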

Word forms

  1. Singular Noun
  2. Plural Noun
  3. Adjective
  4. Prefix
  5. Present (1st)
  6. Present (3rd)
  7. Preterite
  8. Past Participle
  9. Present Participle

Common file types

Compressed data files

The basic structure of DF's compressed data files is simply an array of compressed block structs. The same structure was used for save files in 40d.
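Under that description, a whole file can be unpacked by walking the blocks until the data runs out. The Python sketch below assumes little-endian sizes, no header before the first block, and that the decompressed payloads are meant to be concatenated into one stream; the function name is hypothetical.

```python
import struct
import zlib


def decompress_blocks(data):
    """Decompress a byte string made of back-to-back compressed blocks."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated Z_Size field")
        # Assumption: Z_Size is a little-endian uint32.
        (z_size,) = struct.unpack_from("<I", data, pos)
        pos += 4
        z_data = data[pos:pos + z_size]
        if len(z_data) != z_size:
            raise ValueError("truncated Z_Data field")
        out += zlib.decompress(z_data)
        pos += z_size
    return bytes(out)
```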

I will attempt to describe the uncompressed structure of these files below:

Weird Text Files

Used for announcement/dipscript strings.

These are essentially a List of:

Type Name Description
uint32 StrLen32 Length of string in bytes.
uint16 StrLen Length of string in bytes. Same value as StrLen32.
bytes StrVal StrLen characters of string data. Not null-terminated.
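A sketch of a reader for this layout, operating on already-decompressed data, might look like the following. It assumes little-endian integers and treats a mismatch between the two length fields as an error; the function name is hypothetical.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or raise."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data


def read_weird_text(data):
    """Parse a List of (StrLen32, StrLen, StrVal) entries into byte strings."""
    stream = io.BytesIO(data)
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    entries = []
    for _ in range(count):
        (len32,) = struct.unpack("<I", read_exact(stream, 4))
        (len16,) = struct.unpack("<H", read_exact(stream, 2))
        # The two length fields are described as holding the same value.
        if len32 != len16:
            raise ValueError("StrLen32 and StrLen disagree")
        entries.append(read_exact(stream, len16))
    return entries
```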

data/index

The index file uses a special case of the format used for announcements and dipscripts. The characters in each string are encoded using a simple symmetric cipher:

for i := 0 to (StrLen - 1) do
  c[i] := (not c[i]) - (i mod 5);
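The same transform can be written in Python as follows. This is a sketch that assumes the arithmetic wraps around at 8 bits (each character is one byte); the function name is hypothetical. Because the cipher is symmetric, one function serves for both encoding and decoding.

```python
def index_cipher(data):
    """Apply c[i] := (not c[i]) - (i mod 5) to every byte, modulo 256."""
    return bytes((~b - (i % 5)) & 0xFF for i, b in enumerate(data))
```

Applying index_cipher twice returns the original bytes, since the bitwise complement turns the subtraction into an addition on the second pass.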


0.31.xx

Save files

The format used for saves since 0.31.01 differs from that used in 40d. Each save file now begins with a header:

Type Name Description
uint32 save_version Specifies which version of the save structure was used when writing this file.
uint32 is_compressed Is the file compressed? 1 = compressed, 0 = uncompressed

The header itself is always uncompressed, but in compressed saves, any data which follows will be stored in compressed blocks.
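Putting the header and the compressed block structure together, a save file could be unpacked roughly as below. The sketch assumes little-endian fields and that the blocks run back to back until the end of the file; the function name is hypothetical.

```python
import struct
import zlib


def parse_save(data):
    """Return (save_version, uncompressed_body) for a 0.31.xx save file."""
    if len(data) < 8:
        raise ValueError("file too short for save header")
    # Assumption: both header fields are little-endian uint32s.
    save_version, is_compressed = struct.unpack_from("<II", data, 0)
    body = data[8:]
    if is_compressed == 0:
        return save_version, body
    if is_compressed != 1:
        raise ValueError("unexpected is_compressed value")
    out = bytearray()
    pos = 0
    while pos < len(body):
        if pos + 4 > len(body):
            raise ValueError("truncated Z_Size field")
        (z_size,) = struct.unpack_from("<I", body, pos)
        pos += 4
        z_data = body[pos:pos + z_size]
        if len(z_data) != z_size:
            raise ValueError("truncated Z_Data field")
        out += zlib.decompress(z_data)
        pos += z_size
    return save_version, bytes(out)
```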

Common structures

Beast defs

The beast defs are composed of a List of string-lists; each string-list is a List of String structures which contain the raw tags defining a forgotten beast, demon, titan, or night creature.
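Since the beast defs are just nested Lists of Strings, they can be read with two small generic helpers. The Python sketch below assumes little-endian integers and returns the raw tags as bytes; all names are hypothetical.

```python
import io
import struct


def read_exact(stream, n):
    """Read exactly n bytes or raise."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data


def read_string(stream):
    """uint16 length followed by that many bytes."""
    (str_len,) = struct.unpack("<H", read_exact(stream, 2))
    return read_exact(stream, str_len)


def read_list(stream, read_entry):
    """uint32 count followed by that many entries."""
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    return [read_entry(stream) for _ in range(count)]


def read_beast_defs(stream):
    """Return one list of raw-tag strings per generated creature."""
    return read_list(stream, lambda s: read_list(s, read_string))
```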

String tables

The string tables enumerate all the raw section names used by the save.

Each section is a List of String structures, each containing the name of a raw object; there are 19 sections in total:

  1. INORGANIC materials
  2. PLANT types
  3. BODY types
  4. BODYGLOSS entries
  5. CREATURE entries
  6. ITEM types
  7. BUILDING types (custom workshops)
  8. ENTITY classes
  9. WORD entries
  10. SYMBOL entries
  11. TRANSLATION entries (languages)
  12. COLOR entries
  13. SHAPE entries
  14. COLOR_PATTERN entries (also includes COLOR entries)
  15. REACTION entries
  16. MATERIAL_TEMPLATE types
  17. TISSUE_TEMPLATE types
  18. BODY_DETAIL_PLAN entries
  19. CREATURE_VARIATION types
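Reading the string tables then amounts to reading 19 String-lists in a row. The sketch below assumes the sections are stored back to back in the order listed above, with little-endian integers; the dictionary keys are labels chosen for this example, not identifiers taken from the file.

```python
import io
import struct

SECTION_NAMES = [
    "INORGANIC", "PLANT", "BODY", "BODYGLOSS", "CREATURE", "ITEM",
    "BUILDING", "ENTITY", "WORD", "SYMBOL", "TRANSLATION", "COLOR",
    "SHAPE", "COLOR_PATTERN", "REACTION", "MATERIAL_TEMPLATE",
    "TISSUE_TEMPLATE", "BODY_DETAIL_PLAN", "CREATURE_VARIATION",
]


def read_exact(stream, n):
    """Read exactly n bytes or raise."""
    data = stream.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of data")
    return data


def read_string_list(stream):
    """uint32 count, then that many (uint16 length + bytes) strings."""
    (count,) = struct.unpack("<I", read_exact(stream, 4))
    result = []
    for _ in range(count):
        (str_len,) = struct.unpack("<H", read_exact(stream, 2))
        result.append(read_exact(stream, str_len))
    return result


def read_string_tables(stream):
    """Return a dict mapping each section label to its list of names."""
    # Assumption: sections appear consecutively, in the listed order.
    return {name: read_string_list(stream) for name in SECTION_NAMES}
```

With the tables loaded, the WORD and TRANSLATION entries are the lists that the WordIndex and Language fields of a Name struct index into.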

WORLD.DAT

See WORLD.DAT

WORLD.DAT files contain data for worlds in which no adventurer or fortress is currently active.

WORLD.SAV

See WORLD.SAV

WORLD.SAV files contain data for worlds which have an active adventurer or fortress.