
Editing User talk:Jifodus/Dwarf Fortress Utility Framework

[[User:Sphr|Sphr]] 03:00, 13 December 2007 (EST)
 
 
== Response to Sphr's Comments + Current Implementation Details ==
 
 
====Response to Sphr's Comments====
 
Correct me if I'm reading your comments incorrectly (it probably wasn't a good idea to respond while my brain was falling asleep).

I think I've got the basic system down already, though some changes will probably still be made (fortunately it's still in development, so the structure can still change). A rough idea, taken straight from the data files as they stand right now:
 
Types[V0_27_169_33E]["raw"] = { size = 1 }; -- size is one, it represents a fixed
 
          -- array of chars which is done through overriding fixed_size
 
Types[V0_27_169_33E]["word"] = { size = 2 };
 
Types[V0_27_169_33E]["dword"] = { size = 4 };
 
Types[V0_27_169_33E]["pointer"] = { size = 4 };
 
Types[V0_27_169_33E]["string"] = { size = 28, members = {
 
buffer = { type = { type = "raw", fixed_size = 16 }, offset = 0x4 },
 
buffer_ptr = { type = "pointer", offset = 0x8 },
 
length = { type = "dword", offset = 0x14 },
 
capacity = { type = "dword", offset = 0x18 }
 
} };
 
Types[V0_27_169_33E]["creature"] = { size = 1636, members = {
 
firstname = { type = "string", offset = 0x000 },
 
nickname = { type = "string", offset = 0x01C },
 
languagename = { type = "langname", offset = 0x038 },
 
customprofession = { type = "string", offset = 0x06C },
 
typeid = { type = { type = "word", fixed_size = 2 }, offset = 0x088 },
 
...
 
unknown1 = { type = { type = "vector", subtypes = { "word" } }, offset = 0x0B4 },
 
...
 
} };
 
AddressMaps[V0_27_169_33E]["main_creatures"] = {
 
type = {
 
type = "vector",
 
subtypes = { type = "pointer", subtypes = { "creature" } }
 
},
 
pointer = 0x01240AC8
 
};
 
Now to explain the above data definition. You have your basic types: raw (equivalent to Sphr's array type), word (2-byte integer), and pointer (a pointer to a memory location). Then there is the first complex object, the string. The only bit that really needs explaining is the type field of buffer, where the type gets overridden: it takes the basic type (raw) and changes the fixed array size from the default of 1 to 16. The internal object managing the type "raw" will then correctly read the 16 bytes of the buffer. The same applies to the typeid field of the creature structure. The next bit that needs explaining is unknown1 of creature, which overrides the vector object to set the subtype to word. Then, when utilities start accessing indices of the vector, the framework creates the correct wrapper objects. The address map example takes the wrapping a step further by nesting the subtypes.
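To make the override rule concrete, here is a minimal C++ sketch. It is not the framework's actual code: `TypeDef`, `TypeOverride`, `resolve`, and `byteLength` are hypothetical names invented for illustration, and the only assumption is that an override copies the base type and replaces its fixed array size.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for one entry of the Types table.
struct TypeDef {
    std::size_t size;        // size of a single element in bytes
    std::size_t fixed_size;  // number of elements; the default is 1
};

// Hypothetical stand-in for an inline override such as
// { type = "raw", fixed_size = 16 }.
struct TypeOverride {
    std::string base;        // name of the base type being overridden
    std::size_t fixed_size;  // 0 means "keep the base type's default"
};

// Copy the base definition and apply the override on top of it.
// Throws std::out_of_range if the base type is unknown.
TypeDef resolve(const std::map<std::string, TypeDef>& types,
                const TypeOverride& ov) {
    TypeDef result = types.at(ov.base);
    if (ov.fixed_size != 0) {
        result.fixed_size = ov.fixed_size;
    }
    return result;
}

// Total number of bytes a wrapper object would read for this member.
std::size_t byteLength(const TypeDef& t) {
    return t.size * t.fixed_size;
}
```

With "raw" defined as one byte, overriding fixed_size to 16 gives a 16-byte read for the string buffer; with "word" defined as two bytes, a fixed_size of 2 gives 4 bytes for typeid.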
 
 
There are two data limitations (partially caused by framework limitations) that prevent me from directly following what Sphr suggested: definitions cannot be nested, and you can't extend or override the member map. That means you can't create a vector object inline.
 
 
As for identifying DF versions, this is what's available in the data file:
 
Signatures[V0_27_169_33E] = {
 
pe_timestamp = 0x475B7526,
 
adler32_of_text_section = 0x????????,
 
text_segments = {
 
{ address = 0x00??????, segment_data = "\034\123d_l..." },
 
{ address = 0x00??????, segment_data = "\234\143r*3..." },
 
}
 
}
 
The PE timestamp is currently the only item checked; the rest is for future versions of the framework to use. Also, I avoided CRC because Wikipedia states that there is no single standard divisor on which the CRC is built (there are standards, but not one standard). Since adler32 does have a standard construction, I chose adler32 instead.
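For reference, the standard Adler-32 construction (as given in RFC 1950) fits in a few lines. This is a sketch of the checksum only; how the framework would feed the text section into it is not covered here, and the function name `adler32` is just an illustrative choice.

```cpp
#include <cstddef>
#include <cstdint>

// Adler-32 as defined in RFC 1950: two running sums modulo 65521,
// packed into one 32-bit value with the second sum in the high half.
std::uint32_t adler32(const unsigned char* data, std::size_t len) {
    const std::uint32_t MOD_ADLER = 65521;  // largest prime below 2^16
    std::uint32_t a = 1;  // sum of all bytes, plus one
    std::uint32_t b = 0;  // sum of all intermediate values of a
    for (std::size_t i = 0; i < len; ++i) {
        a = (a + data[i]) % MOD_ADLER;
        b = (b + a) % MOD_ADLER;
    }
    return (b << 16) | a;
}
```

As a sanity check, the nine bytes of the ASCII string "Wikipedia" give 0x11E60398, and empty input gives 1.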
 
====Pre-release Implementation Details====
 
(Basically the only reason why I'm including it here is so that the chosen data format
 
actually makes some sort of reasonable sense.)
 
 
The one thing about my framework that is somewhat good and somewhat bad is that none of the types are actually hardcoded. Sure, for accessing types there are hardcoded limitations, but if there were no memory access, the base framework wouldn't care about the difference between:
 
Types[V0_27_169_33E]["pointer"] = { size = 4 };
 
and:
 
Types[V0_27_169_33E .. "x64"]["pointer"] = { size = 8 };
 
However, I do have interfaces that wrap access to integers, pointers, and floating-point values, and they have hardcoded limitations. Pointers are not much of a problem, because I also have an interface wrapping a pointer, and if the utility doesn't need the actual address, the framework can handle and store the pointer however it wants.
 
 
A side note about pointers: if a pointer gets changed (and it can only be changed internally to the framework), all the pointers stemming from that pointer get changed appropriately as well. The framework takes advantage of that by having each "memory object" maintain a pointer to where it is in memory. As the utility maps members of the "memory object" for access, it has the pointer wrapper create a new pointer wrapper to the offset location.

i.e. cPointer *pointer = base->getAddress(member); // returns a new cPointer object; base retains ownership of that object and will destroy it when base gets freed.
 
 
What benefits does this have? I have this type of code in the vector wrapper object.
 
if (cache[index] == NULL) {
 
  cPointer *begin_ptr = begin->getAddress(); // begin_ptr is actually just the address of a member in the begin object
 
  iType *subtype = type->getSubType(0); // first subtype is the type the vector wraps
 
  iMemoryType *member = dfprocess->mapObject(begin_ptr->getAddress(index * subtype->getSize()), subtype);
 
  cache[index] = member;
 
}
 
return cache[index];
 
So now, all the vector wrapper has to do is initialize the index once. Then when the position of the vector suddenly changes in memory (due to DF spamming the creation of new creatures), I don't have to worry about updating the cache. In addition, if the utility has stored any of the returned objects, those objects will still be usable.
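The reason the cache never needs updating can be shown with a small standalone sketch. This is not the framework's cPointer: `RelPointer` and its methods are hypothetical, ownership is left out entirely, and the assumption is simply that a derived pointer stores its parent plus an offset and resolves the address on demand.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pointer wrapper: either a root holding an absolute address,
// or a child holding an offset relative to a parent wrapper.
class RelPointer {
public:
    // Root pointer at an absolute address.
    explicit RelPointer(std::uintptr_t absolute)
        : parent_(nullptr), value_(absolute) {}

    // Child pointer located `offset` bytes past `parent`.
    RelPointer(const RelPointer* parent, std::uintptr_t offset)
        : parent_(parent), value_(offset) {}

    // Resolved on every call, so a change to any ancestor is picked up
    // without touching the children.
    std::uintptr_t address() const {
        return parent_ ? parent_->address() + value_ : value_;
    }

    // Models the framework relocating an object; only affects root pointers.
    void rebase(std::uintptr_t absolute) {
        if (!parent_) {
            value_ = absolute;
        }
    }

private:
    const RelPointer* parent_;  // nullptr for a root pointer
    std::uintptr_t value_;      // absolute address (root) or offset (child)
};
```

If a root at 0x1000 has a child at offset 12, the child resolves to 0x100C; after the root is rebased to 0x2000, the same child object resolves to 0x200C without being recreated, which is the property the vector cache relies on.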
 
 
I think I've covered just about everything worth covering.
 
-- [[User:Jifodus|Jifodus]] 04:24, 13 December 2007 (EST)
 
::Forgot to point out something that may have caused a misunderstanding: everything before the "additional notes" part of my comments actually refers to the persistent data format rather than the run-time format :)  I just chose something similar to the Lua structs that you used.  When defining the persistent data, I think it would be nice to allow various ways of doing the same thing to suit potential users.  And I think even if the specification data is defined inline, the framework you have should be able to resolve it nicely during parsing (e.g. when encountering an inlined map, just create that map first, give it some name/reference, and then continue parsing the current map with the new map id, as if it had been defined much earlier).  It can all be resolved nicely into separate entries in your Type map.
 
::Another thing.  You could extend your map all the way to the process itself.  Looking up something like creature vector is just looking up an offset in the process's map just like any other complex type, so that you don't need to deal with a separate "AddressMaps" structure.  Just a comment though, as you might want to keep things working as they are.
 
<pre>
 
Types[V0_27_169_33E]["process"] = {
    members = {
        main_creatures = {type = "vector", subtypes = { type = "pointer", subtypes = { "creature" }},
            offset = 0x01240AC8
        }
    }
}
df_process = { type = "process", version = V0_27_169_33E, pointer = 0x00 } -- ??
 
</pre>
 
::[[User:Sphr|Sphr]] 04:56, 13 December 2007 (EST)
 
:::I just realized something: type inlining would be fairly difficult to implement with Lua. I've been using lua_next, so I can't assume anything about the order in which the keys will be returned, which basically means I would have to scan the entire tree multiple times to locate all types first.

:::Though I just had an idea: I could always change the way the system parses. If it encounters a type not previously defined, it could create a new type with that information and subclass from there. This, of course, would have to come after I enable subclassing/extending of the member map.

:::I will not, however, get to this for the first release. Down the road, having inline type definitions would be incredibly useful.
 
:::[[User:Jifodus|Jifodus]] 15:37, 13 December 2007 (EST)
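One way to cope with keys arriving in an arbitrary order is to create a placeholder the first time a type name is referenced and fill it in when its definition shows up. The sketch below only illustrates that idea; `TypeEntry` and `TypeRegistry` are hypothetical names, and subclassing of the member map is not modelled.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical registry entry: exists as soon as a type is referenced,
// but is only marked defined once its definition has been parsed.
struct TypeEntry {
    std::size_t size = 0;
    bool defined = false;
};

class TypeRegistry {
public:
    // Returns the entry for `name`, creating an undefined placeholder if
    // this is the first time the name is seen. References into a std::map
    // stay valid when other entries are added later.
    TypeEntry& reference(const std::string& name) {
        return types_[name];
    }

    // Fills in the definition, whether or not a placeholder already exists.
    void define(const std::string& name, std::size_t size) {
        TypeEntry& entry = types_[name];
        entry.size = size;
        entry.defined = true;
    }

    // After the whole file is parsed, every referenced type must be defined.
    bool complete() const {
        for (const auto& kv : types_) {
            if (!kv.second.defined) {
                return false;
            }
        }
        return true;
    }

private:
    std::map<std::string, TypeEntry> types_;
};
```

With this shape a single pass is enough: a member can hold on to the entry for "creature" before "creature" has been parsed, and `complete()` reports at the end whether anything was referenced but never defined.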
 
::::I'm just commenting based on the presumed desired outcome.  No need to get there in one large leap.  Small steps are fine.  Not implementing that is OK too.  It's nice to give more convenience to the end users, but it is not necessary, especially at a cost too great to be economic for the implementors :)  Btw, I hope that your parser has convenient ways to ignore parts that it doesn't recognize or use?  If we are working towards a common persistent format, chances are there will be data that is only used by one party and not by the other.  The ideal case is that the unused data gets ignored safely (or just generates warnings without killing the whole process).  I'll be occupied this weekend, but if I have time next week, I'll see if I can come up with an XML alternative to what you have.  (XML is easier for me if I'm using an existing library, like tiny.  Otherwise, I'll have to define the Lua grammar manually for the parser, which could take quite a few rounds of grammar debugging.)  Once the formats stabilize, the next step could be a converter tool to freely transform one file format to the other. :) [[User:Sphr|Sphr]] 07:57, 15 December 2007 (EST)
 
:::::Heh, at the moment it ignores parts that it doesn't understand or that are not formatted correctly. In fact, when I was debugging StartProfile, nearly all (if not all) of the problems I had were caused by incorrectly formatted data. So I will come up with an additional tag:
 
some_member = { type_ptr = "creature" }
 
:::::Which is the same as doing:
 
some_member = { type = { type = "pointer", subtypes = { "creature" } } }
 
:::::Not properly formatting pointers alone caused the majority of my problems. As to writing a parser for the lua data files, why not use the Lua library itself? It's small and relatively easy to use (if you need an example, you can always take a look at my loading code.) [[User:Jifodus|Jifodus]] 00:55, 16 December 2007 (EST)
 
