v50 Steam/Premium information for editors
  • v50 information can now be added to pages in the main namespace. v0.47 information can still be found in the DF2014 namespace. See here for more details on the new versioning policy.
  • Use this page to report any issues related to the migration.
This notice may be cached—the current version can be found here.

Editing User:Vasiln/Goblin Logic 3

Jump to navigation Jump to search

Warning: You are not logged in.
Your IP address will be recorded in this page's edit history.


The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.

Latest revision Your text
Line 1: Line 1:
 
==Optimization: Opcodes revisited==
 
==Optimization: Opcodes revisited==
 +
 
Right now, we have seven operations codes to spare: 0000, 1010, 1011, 1100, 1101, 1110, and 1111; and 15 arguments for XOR; and 8 arguments for bit shift or bit rotate.  Way more than we need.  At the same time, we don't have nearly enough memory.  We could do a couple of things: we could add instructions, so that for instance it wouldn't take two instructions to add two numbers.  We could also switch it up to a 3-bit instruction and a 5-bit address.
 
Right now, we have seven operations codes to spare: 0000, 1010, 1011, 1100, 1101, 1110, and 1111; and 15 arguments for XOR; and 8 arguments for bit shift or bit rotate.  Way more than we need.  At the same time, we don't have nearly enough memory.  We could do a couple of things: we could add instructions, so that for instance it wouldn't take two instructions to add two numbers.  We could also switch it up to a 3-bit instruction and a 5-bit address.
  
Line 15: Line 16:
 
Two of these need specifying.
 
Two of these need specifying.
  
=====100xxxxx Register operations=====
+
100xxxxx Register operations
 +
 
 
  *10000000 XOR a and b -> a
 
  *10000000 XOR a and b -> a
 
  *10000001 Add: XOR a and b -> a, write 9-bit carry word to buffer, XOR a and carry word->a, write 8-bit carry word to buffer, with carry(8)=output of first XOR(8) OR output of second XOR(8)
 
  *10000001 Add: XOR a and b -> a, write 9-bit carry word to buffer, XOR a and carry word->a, write 8-bit carry word to buffer, with carry(8)=output of first XOR(8) OR output of second XOR(8)
Line 26: Line 28:
 
  *10001000-10011111 Reserved (24 choices!)
 
  *10001000-10011111 Reserved (24 choices!)
  
=====101xxyyy Bit shifting register operations=====
+
101xxyyy Bit shifting register operations
 +
 
 
  *10100yyy Arithmetic right shift 0-7 units (y units)
 
  *10100yyy Arithmetic right shift 0-7 units (y units)
 
  *10101yyy Arithmetic left shift 0-7 units (y units)
 
  *10101yyy Arithmetic left shift 0-7 units (y units)
Line 43: Line 46:
  
 
===A note on the notion of "Turing complete"===
 
===A note on the notion of "Turing complete"===
 +
 
To be Turing complete, a computer has to have four things: a NAND, a NOR, a jump if, and infinite memory.
 
To be Turing complete, a computer has to have four things: a NAND, a NOR, a jump if, and infinite memory.
  
Line 54: Line 58:
  
 
==Optimization: Memory Addressing Revisited==
 
==Optimization: Memory Addressing Revisited==
 +
 
We spent a lot of time optimizing our adder, and ended up with a parallel add/look-ahead that was very fast-- less than half the time of our parallel ripple, and far faster than a full adder.  What I forgot to mention is that there was a lot of stupid delay built into the add just by virtue of requiring two memory accesses (two XOR operations).  We've been assuming that a single goblin will address all of our bits in a particular bit location, but that takes a lot of time: trip the hatch takes 100 ticks just to get him started, and with 32 bits to address, we're probably looking at something like 30 moves to make it to our addressable bit-- so we've got around 400 ticks just to read memory/buffer, followed by 100 ticks of latency for our memory.
 
We spent a lot of time optimizing our adder, and ended up with a parallel add/look-ahead that was very fast-- less than half the time of our parallel ripple, and far faster than a full adder.  What I forgot to mention is that there was a lot of stupid delay built into the add just by virtue of requiring two memory accesses (two XOR operations).  We've been assuming that a single goblin will address all of our bits in a particular bit location, but that takes a lot of time: trip the hatch takes 100 ticks just to get him started, and with 32 bits to address, we're probably looking at something like 30 moves to make it to our addressable bit-- so we've got around 400 ticks just to read memory/buffer, followed by 100 ticks of latency for our memory.
  
Line 87: Line 92:
 
You might notice two problems.  Before, we were using a goblin to tell us if memory r/w was complete.  Then, he had eight doors to walk through.  Now, to do the same thing, he'd need 256 doors to walk though.  The sheer distance would negate any speed benefit by independently addressed memory.  How can we work around that?
 
You might notice two problems.  Before, we were using a goblin to tell us if memory r/w was complete.  Then, he had eight doors to walk through.  Now, to do the same thing, he'd need 256 doors to walk though.  The sheer distance would negate any speed benefit by independently addressed memory.  How can we work around that?
  
The other problem is that this memory is 19 tiles wide.  We can't fit 8 of these in a 3 square embark.  Let's do some rearranging.
+
The other problem is that this memory is 19 tiles wide.  We can't fit 8 of these in a 3 square embark.
  
          ###
 
        ###d##                Set=True
 
      ##^hh^##            False    True
 
  ######h#d##h#                        Set=False
 
## d#hX####hX#
 
# ##^###hd^####          Write True to Memory
 
# # ### ###d  ##
 
# # h## ###### #
 
# # h # ###hX# #
 
# # h # hh^### #          Write False to Memory
 
# # h # ###d  #
 
# ##h # ###### #
 
#h#d ## ###hX# #
 
##^##d# dd^### #          Write True to Buffer
 
#h#h#d# ###d  #
 
#X# #d# ###### #
 
# #d#d# ###hX# #
 
# #^##d#dh^### #          Write False to Buffer
 
##h#h######d  ##
 
  ####      ###
 
    #########
 
  
I've consolidated the addressing doors-- in fact, I have room for one more addressing door (which we just might use).  I've rearranged the r/w functions to optimize for writes to buffer instead of writes to memory, because the former are going to occur with more frequency than the latter, but it would be a good idea for me to perform reverse optimization on one or two words at the end of the memory space-- the 31st and 32nd bytes. Our memory's been curved around a little bit (at the top of the diagram) but it works the same.
+
    #############
 +
    ##          ##
 +
  ## ######d#### #
 +
  # ##    h^hXh#d#    Null completion (no r/w)
 +
  # #hhhhh#####^##          R/W completion
 +
  # #      ##h###
 +
  # # ########d ##
 +
  # # ddddddd^## #
 +
  ## # ########h# #
 +
  # ## ######Xh## #
 +
  #h#d ########d ##
 +
  ##^# ddddddd^## #          Write true to buffer also opens hatch two tiles north to prevent escape
 +
  #h## ########h# #
 +
  #X## #######X## #
 +
  #h## ######h### #
 +
  #^## #######^## #
 +
##hd# ddddddd#d ##
 +
#dh## ########## #
 +
  ##^## ddddddd#d ##
 +
  #h## #######^####
 +
  #X###########hX#
 +
  ###        ####
 +
 
  
 
I could make it smaller, and thus faster, by using z-levels, but I don't want to do that.  There is a lot to be said for keeping your design understandable at a glance.  By the time I'm done, even I won't understand what I'm saying or what I've done.
 
I could make it smaller, and thus faster, by using z-levels, but I don't want to do that.  There is a lot to be said for keeping your design understandable at a glance.  By the time I'm done, even I won't understand what I'm saying or what I've done.
  
One memory cell needs 2 goblins, 42 doors/hatches, and 63 mechanisms, if I counted right.  Entire memory space will thus require 64 goblins, 1312 doors/hatches, and 2016 mechanisms.
+
One memory cell needs 2 goblins, 58 doors/hatches, and 96 mechanisms, if I counted right.  Entire memory space will thus require 64 goblins, 1856 doors/hatches, and 3072 mechanisms.
  
Our memory cell is large-- 16 wide, 21 tall.  Sides are just walls.  8 of these, side by side, would take up 121 tiles.  That means that to keep a nice looking computer, we need a 3-wide embark.  In that space, I could keep 8 and expand my memory cell by 2 tiles, which would give me a whole mess of addressing bits.  A 2 square tall embark would give us enough room for 4 of these.  We might just need that-- each byte needs to be separated from the other by 2 dead z-levels, remember (could get around that with offsets, but that requires visualizing everything at once).  So we could fit our entire memory space inside 24 z-levels of one map.
+
Our memory cell is large-- 18 wide, 22 tall.  Sides are just walls.  8 of these, side by side, would take up 137 tiles.  That means that to keep a nice looking computer, we need a 3-wide embark.  A 2 square tall embark would give us enough room for 4 of these.  We might just need that-- each byte needs to be separated from the other by 2 dead z-levels, remember (could get around that with offsets, but that requires visualizing everything at once).  So we could fit our entire memory space inside 24 z-levels of one map.
  
 
I'd rather keep it subterranean.  I don't like the idea of a roc flying in and killing my memory.  I guess I could get away with bridged pitting squares instead.
 
I'd rather keep it subterranean.  I don't like the idea of a roc flying in and killing my memory.  I guess I could get away with bridged pitting squares instead.
  
Let's consider build order too.  We want any linked hatches or doors to be nearby in the build queue-- so that we can find them with a minimum of keystrokes.  In our case, we can't build address doors until we've built an address buffer, buffer T/F doors until we've built a buffer, the begin door of our reset or until we've built a cycle manager or continue doors until we've built a r/w manager, but everything else in this circuit just links to itself.
+
Let's consider build order too.  We want any linked hatches or doors to be nearby in the build queue-- so that we can find them with a minimum of keystrokes.  In our case, we can't build address doors until we've built an address buffer, buffer T/F doors until we've built a buffer, the begin door of our reset or continues until we've built a cycle manager, but everything else in this circuit just links to itself.
  
 
What about the problem of knowing when r/w is complete?  By separating our continues into r/w and null, we guarantee that each bit location will only return a single r/w signal (and 31 null signals).  We can still tell, with just eight doors.  But each door has to be linked to 32 different r/w completion points.
 
What about the problem of knowing when r/w is complete?  By separating our continues into r/w and null, we guarantee that each bit location will only return a single r/w signal (and 31 null signals).  We can still tell, with just eight doors.  But each door has to be linked to 32 different r/w completion points.
  
 
==Optimization: Memory Latency==
 
==Optimization: Memory Latency==
 +
 
So far, we've only been linking our operations to the true part of memory.  We've been using FALSE and NOT TRUE interchangeably.  But they're not quite interchangeable-- not really.  Consider the following two circuits:
 
So far, we've only been linking our operations to the true part of memory.  We've been using FALSE and NOT TRUE interchangeably.  But they're not quite interchangeable-- not really.  Consider the following two circuits:
  
Line 147: Line 154:
  
 
==Cycle Manager==
 
==Cycle Manager==
 +
 
Every cycle of our computer follows a certain flowchart.  Let's spell it out.
 
Every cycle of our computer follows a certain flowchart.  Let's spell it out.
  
Line 171: Line 179:
  
 
==Special Memory: Getting stuff done==
 
==Special Memory: Getting stuff done==
So far, our computer doesn't do anything.  I mean, it can compute anything we throw at it, but it doesn't kill goblins or irrigate our farms or anything like that.  What we need are special bytes: output and input.
+
 
 +
So far, our computer doesn't do anything.  I mean, it can compute anything we throw at it, but it doesn't kill goblins or irrigate our farms or anything like that.  What we need is a special byte: an output byte.
  
 
===Output===
 
===Output===
 +
 
Output bytes look like any other kind of byte (yeah, I'm going to stop showing you that same picture of a memory bit now.)  The only difference is that the pressure plate in the memory is linked to more than just r/w functions-- it's linked to existing devices.  For instance, a bridge could raise when the goblin moves into TRUE position and lower when the goblin moves off of TRUE position.
 
Output bytes look like any other kind of byte (yeah, I'm going to stop showing you that same picture of a memory bit now.)  The only difference is that the pressure plate in the memory is linked to more than just r/w functions-- it's linked to existing devices.  For instance, a bridge could raise when the goblin moves into TRUE position and lower when the goblin moves off of TRUE position.
  
Line 205: Line 215:
 
   jump if not b
 
   jump if not b
  
I told you-- we were going to do a lot more bit shifts than add-with-carries.
+
I told you-- we were going to do a lot more bit shifts than add with carries.
  
 +
===Input===
  
====Decimal Output====
 
By the time we're done with this computer, thinking in binary (base 2) will come to us as naturally as thinking in decimal (base 10).  So we probably don't really need to output in decimal.  But of course we want people to be impressed with our computer, and they're never as impressed by "0000100/00000001=00000010" as they are by seeing "8/2=4."
 
 
Decimal output is actually quite the problem.  We'll have a goblin with ten possible positions (0-9) for each digit, which means a ten-armed path.  But how many inputs do we need?  We'll start with the ones:
 
 
Goblin is at 0 if output(0) is 0 and output(1) is 0 and output(2) and output (3) is 0 OR if output(0) is 0 and output(1) is 1 and output(2) is 0 and output(3) is 1.  Right?  Not quite.  What if output (4) is 1?  Then the goblin would be at zero when the value was 16 or 26!  We have to examine ALL of our output bits to determine what our decimal ones value is.  What if we want a high level of precision-- using 2 bytes to store a single number that we then want outputted in decimal?  We have to examine all 16 bits!  So we have 10 arms, each of which divides with OR statements, each of which evaluates 16 operands.  Even then, it's not scalable.
 
 
There's a better way to do it: do the hard work in software.  Consider the following program:
 
 
  Given Variables: BOP(0-7), DOP100, DOP10, DOP1, Count; Constants: 1, 10, 100, 0        variables binary output, decimal output 100s, etc
 
  begin 100s loop:
 
        If 100>BOP write 0 to DOP100 and jump to end of loop
 
        Increment count
 
        Subtract 100 from BOP and write the result to BOP
 
        Return to begin 100s loop
 
  write count to DOP100
 
  write 0 to count
 
  begin 10s loop:
 
        If 10>BOP write 0 to DOP10 and jump to end of loop
 
        Increment count
 
        Subtract 10 from BOP and write the result to BOP
 
        Return to begin 10s loop
 
  write count to DOP10
 
  write 0 to count
 
  begin 1s loop:
 
        If 1>BOP write 0 to DOP1 and jump to end of loop
 
        Increment count
 
        Subtract 1 from BOP and write the result to BOP
 
        Return to begin 10s loop
 
  write count to DOP1
 
 
I didn't write it in our language-- I wrote it in functions that we know how to use already.  You can see what it does-- it writes separate decimal values to three different bytes.  The nice thing is that it's extensible to any size of decimal number.  If we have a 2-byte number, we also need a DOP1000 and a DOP10000.
 
 
It's kind of a waste of our memory to store the decimal output (for which we really need no more than 4 bits) in an entire byte.  Remember how we wrote separate instructions and data to a single byte before?  We could do that again!
 
 
left shift DOP10 4 times
 
left shift DOP1 4 times
 
right shift DOP1 4 times
 
XOR DOP10 and DOP1, write to DOP1
 
 
What we're doing is storing decimal values in hexidecimal (base 16) format.  Now, we need two digits per output byte, and we still have ten paths, but the work is much much simpler.  Here's decimal output for DOP1(0-3):
 
 
If DOP1(0)=0 AND DOP1(1)=0 AND DOP(2)=0 AND DOP(3)=0) then output 0
 
If DOP1(1)=1 AND DOP1(1)=0 AND DOP(2)=0 AND DOP(3)=0) then output 1
 
If DOP1(1)=0 AND DOP1(1)=1 AND DOP(2)=0 AND DOP(3)=0) then output 2
 
...
 
If DOP1(1)=0 AND DOP1(1)=0 AND DOP(2)=0 AND DOP(3)=1) then output 8
 
If DOP1(0)=1 AND DOP(3)=1) then output 9
 
 
Notice that output 9 is a special case.  It just clamps the value.  If it's higher than 9 (can go up to 15!) then we just output 9.  That's because we need to send this circuit decimal output from software.  Our program above should never output a value higher than 9 to any digit value.
 
 
===Input===
 
 
Input's harder.  At first, you wouldn't think so.  Just link a lever to write to true, right?  And another to write to false?
 
Input's harder.  At first, you wouldn't think so.  Just link a lever to write to true, right?  And another to write to false?
  
Line 266: Line 225:
 
What you have to do is prevent the goblin from reading (or writing) while any writing is going on.  That means that every input device (levers, but pressure plates are better) has to prevent a r/w goblin from reading it until 100 ticks after writing to the input.
 
What you have to do is prevent the goblin from reading (or writing) while any writing is going on.  That means that every input device (levers, but pressure plates are better) has to prevent a r/w goblin from reading it until 100 ticks after writing to the input.
  
Remember how I said that there was actually one extra space for an addressing door in our memory?  I wasn't trying to suggest that we build 64 bytes of memory-- that would be a monster (as if 32 isn't.)  What I was talking about was that we could use this to prevent writes to our memory immediately following a write from input.  The easiest, simplest way to do that is end our line of addressing doors with a hatch, and link anything that writes to input to that hatch.  That way, there won't be a path until 100 ticks after a write occurs.  (You still have to make sure that you don't write to it from input so fast that it changes while the goblin's in the pathway!  More on that later.)
+
More to come.
  
====Clock bits====
+
===Input/Output===
A perfect example of this is a clock.  Clocks are great input-- for instance, if we want a doomsday program to constantly check the clock until a certain time is reached, then open the magma floodgates, we need clock-input.  You never want to write to a clock.  But clocks change all the time.
 
 
 
    ###
 
  ###<#
 
  ##h#h#
 
##^#^##
 
#hd#dh#
 
##^#^##
 
#h#h##
 
#<###
 
###
 
 
 
Here we have an incrementing memory cell.  It's the perfect design for a clock.  Somewhere, we have a repeater toggling this memory, any any incrementation ripples through the next bits.  If clock(0) toggles every 200 ticks-- well, if clock(0) toggles every 200 ticks, then by the time we've run a single instruction, clock(0) won't be the same anymore.  But if clock(0) toggles every 1200 ticks (that is, daily), we want to be able to grab a correct value-- not yesterday's value, not tomorrow's value, not some random value.  I mean, what if clock(0) has toggled when we read it, but it has created a carry that hasn't yet percolated through clock(7) when we read it?  It'll look like we went back in time.
 
 
 
Maybe we can link that hatch in our addressing circuit to the same thing that toggles this memory?
 
 
 
No.  We can't.  And it's a perfect demonstration of why we have our basic rules (remember them?)  This makes it so that we have a goblin with spaces to move on, but with no path.  What if the addressing goblin is right on the hatch when the clock rolls over?  He drops, and the program stops, the whole computer stops, waiting for memory read to complete.  (Maybe he kills some of our dwarves too.  But our computer!)
 
 
 
In the case of the clock, what we want to do is block access to the entire circuit whenever a clock updates.  That includes clock(7), which might be incrementing much later (let's be conservative, and say 1400 ticks later).  We're going to institute a guard for the begin door, which is normally triggered whenever it's time to read or write to memory.
 
 
 
####
 
rddc
 
####
 
 
 
Leftmost door is linked to the thing that normally says that writes are allowed to memory.  Rightmost door is linked to our actual write to clock.  And continuation leads to an iterated delay circuit.  (This breaks our rules, but again, since the doors will never close until continuation is reached, only open, it's okay.)
 
 
 
### ###
 
rhc rhc
 
### ###
 
 
 
Continuation of clock guard trips the hatch; 100 ticks later, the 1st delay goblin runs to continuation of delay1, which trips the hatch on delay2; 100 ticks later, delay2 goblins runs over our second continuation, which opens the door to clock(0).  We'll need to institute further iterated delays to read clock(2-7).  Note that we only institute iterated delays on later clock reads-- if we delay 1600 ticks to read clock(0) it will have changed again already!  We can do this, because we have separate access doors to read each of the seven bits.
 
 
 
This is bad.  It means reading the clock is slow.  First, we have to wait for the clock to change; then, we have to wait for all possible changes to percolate.  (We could build a look-ahead unit for our clock to make this faster, but let's not go there.)  Can this be optimized?  Probably.  Should it be optimized?  A dwarf has only so much energy to devote to problems.  (In other words, I don't yet know what the ideal solution is.)
 
 
 
This is also not easily generalizable-- that is, it's specific to clock bits, which we know will change sometime.  But we can't just sit around waiting to see if you're going to throw a lever.  You might never!  We don't know if you're ever going to throw that lever or not.  Do we?
 
 
 
We do-- so long as we can institute sufficient delay in your lever throw, and execute reads/writes first before executing your lever throw.  What we do here is wait for our computer to complete a r/w cycle, then
 
execute your lever throw, in the time between possible r/w.  (There'll be plenty of time.  Believe me.)  This means instituting a similar delay, with a similar manager, to writes to input from lever (or, preferably, pressure plate, because we want everything to reset automatically), rather than to reads from input by computer.  The mechanisms are the same.
 
 
 
=====The Doomsday Program=====
 
set time_to_die = clock
 
set time_to_die = time_to_die + delay
 
begin early loop
 
write clock to original_clock
 
if time_to_die < clock then:
 
  begin early loop:
 
  read clock
 
  if clock < original_clock go to late loop
 
  go to begin early loop
 
begin late loop
 
  read clock
 
  if time_to_die < clock then XOR output with 10000000
 
  go to begin late loop
 
 
 
Why are there two checks to see if the clock has passed ragnarok date?  Because the clock, and our values, can roll over.  What if our delay is 01000000 and our clock is at 11000000?  Time to die will be at 00000000, and since that's less than clock at program start, we'll have a premature armageddon.  Instead, we have to wait for our clock to roll over, which we verify by seeing that clock is at a lower value than it was at the beginning of evaluation.  [What's output(7) do?  Use your imagination.  I prefer to think that it releases a troll into main memory.)
 
 
 
=====The Clock=====
 
You might notice that I never even said what's driving our clock.  That's because it doesn't matter.  What I'm going to use is something like the following:
 
 
 
###        ###
 
#X#        #X#
 
#h##########h#
 
##^d      h^##
 
  #h########d#
 
  # #      # #
 
  # #      # #
 
  # #      # #
 
  # #      # #
 
  # #      # #
 
  # #      # #
 
  #d########h#
 
##^h      d^##
 
#h##########h#
 
#X#        #X#
 
###        ###
 
 
 
It won't run perfectly, because goblins move probabilistically, but over the long term, that will average out.  The goblin running it will also slow as time passes and his attributes rust, but eventually that will bottom out as well.  So we won't know that this clock is moving predictably until a long time has passed-- after that, we see how many iterations it covers in a year (think of how useful our computer will be for that task!) and then do the math, and after that, we know what rate our clock is moving at.
 
 
 
====Decimal Input====
 
Every once in a while, you might want decimal input.  It's hard to imagine why-- every possible means of inputting in DF is limited to binary-- but you might want to do it nonetheless, perhaps by a series of ten levers, each of which inputs a base 10 number from 0-9.
 
 
 
This is simple task-- at least it is, now that we've built a tool set.  We have 10 bits of input (distributed through at least 2 bytes), each of which refers to a single digit; all of our input both writes true to the appropriate bit and writes false to the inappropriate bits.  All we need to do is multiply the digits with an exponent, then add them to zero.  Consider a calculator, where you enter a sequence of bits from 1s to whatevers-- the first number you enter will be the leftmost (most significant), but how big it is depends on how many digits you enter.  Here's what a program reading input delivered in that style would look like:
 
 
 
If digit(0) and digit(1) jump to end
 
If digit(0) then set BIP=00000000
 
If digit(1) then set BIP=00000001
 
.....
 
Set digit=00000011 (make sure our computer knows no further input has been generated)
 
Begin exponentiation:
 
  If digit(1) XOR digit(0)  else jump to this instruction
 
  Multiply BIP*10, output to BIP
 
  If digit(1) add 1 to BIP
 
  If digit(2) add 2 to BIP
 
  .....
 
  Set digit=00000011
 
  Go to begin exponentiation
 
  
What this does is wait for you to input a digit-- every time you do so, it considers that digit as an extra base 10 digit to an existing binary number.  All this program would actually do is let you input a number-- you'd need another input (like an add key, for instance) to get the program to do anything else.  This program would also overflow very easily-- anything greater than 255 would overflow.  You might want to use a multi-word variable for this program.
 
 
===Input/Output===
 
 
You might wonder how you can build a bridge that you can operate with a lever, but that still plays nice with your computer-- so that you can open it or close it, but your computer can too.  This is input/output.  Instead of linking your lever directly to the bridge, you link it to an input bit.  Then, you treat that input bit as an output bit, and link it to your bridge.  Now your computer can operate the bridge, you can operate the bridge, and your computer can even tell if you pulled the lever while it wasn't looking.
 
You might wonder how you can build a bridge that you can operate with a lever, but that still plays nice with your computer-- so that you can open it or close it, but your computer can too.  This is input/output.  Instead of linking your lever directly to the bridge, you link it to an input bit.  Then, you treat that input bit as an output bit, and link it to your bridge.  Now your computer can operate the bridge, you can operate the bridge, and your computer can even tell if you pulled the lever while it wasn't looking.
  
The problem with this is that not only do you have to prevent the computer from reading (or, now, writing to) the bit while you're writing to it-- the computer also has to prevent you from writing to the bit until after it's done writing to it.  We can't fall back on to read-only memory to get around the problem like we did with the clock.  The computer can't wait around to see if you're going to throw the lever, because you might be waiting around to see if the computer's going to throw the lever.  How are we going to get around this?
+
The problem with this is that not only do you have to prevent the computer from reading the bit while you're writing to it-- the computer also has to prevent you from writing to the bit until after it's done writing to it.  Ideally, you'd maintain a sort of stack of actions that you could execute, so that even if you pulled the lever while the computer was writing, your lever pull would go through as soon as it could.
 
 
We're going to get around it because it's not a real problem.  The computer isn't ever sure that you're going to provide input on a subject, but you're sure that the computer is going to provide input on some subject.  That means that there is a specific time that you can provide input: after the computer has read memory to buffer, but before the computer has started to execute any instructions (some of which may include writes to i/o.)
 
 
 
What this means is that we need to institute a delay following memory read-- sufficient delay for input to be written.  This can be managed by our cycle manager.  All we need is 200 ticks for our computer to r/w, followed by 200 ticks for your input to r/w.  Only eight hours, between the two!
 
 
 
Let's do it a little differently.  If we can institute that delay only when writing to i/o memory, we can chop memory access times in half.  That means specifying which bits are i/o, and instituting a delay on "write from register a" or "write from register b" operations dependent on what byte we're writing to.  Let's designate byte 29 as i/o. (30 and 31 are designated variables, because they're faster to write to than to read from.)
 
  
Of course, that's kind of limiting. We might want use i/o on more than 8 devices.  Have you considered that most programs want to work on all bridges, or all floodgates, or all doors?  We need more than one byte of i/o, don't we?
+
More to come.
  
 
===Feedback: Memory multiplexing===
 
===Feedback: Memory multiplexing===
We do.  But there's one more really cool thing we could with i/o.  It doesn't even involve creating any latency.
 
 
We can make a memory cell that outputs to computer functioning.
 
 
We've already demonstrated how to limit access to memory cells without changing the addressing bits.  That same function could be used to allow access to one bit of memory (while denying access to another).  What if we designed an output bit-- that outputted to which memory bits were addressed?  Perhaps via that space that we have for another door or hatch in our latest memory design?  Perhaps by more manager circuits instead?  (Use the managers.)
 
 
Yes, we can do this.  It's called multiplexing.
 
 
We could have 256 banks of memory, each 31 bytes large (well, maybe we could do it on a 16x16 embark) and switch between them with a multiplexer.  All it is an output byte that controls the action of doors allowing memory r/w functions.
 
 
Why 31 and not 32?  Well, because as soon as we multiplex, we no longer have access to any of the memory that we multiplexed out of.  Don't we need to be able to multiplex back to it?  So let's keep our multiplexer as shared memory, between all 256 banks.  That way, all we have to do is write to our multiplexer to return us to the previous bank of memory.
 
 
Why do we even have 5 bits of addressing if we can do this?  There's a serious disadvantage to multiplexing: we can't easily carry variables between multiplexed addresses.  For instance, when we move from bank 00000000 to 00000001, we can only drag one piece of data with us (we have two registers, and one of them has to contain the value that we write to the multiplexer to change memory spaces.)  If we try to read the value in 11111111, we'll grab the data in bank 00000001 instead of in bank 00000000, which aren't going to be the same thing.
 
 
Since 256 banks of memory would take up a huge space (but oh god, think of what we could do with 8kb, who could ever need more than that?), we're not going to do that.  Instead, we're only going to multiplex one byte: our output byte.
 
 
By multiplexing, we can specify which one of 256 output bytes to write to.  We can have 256 banks, each consisting of 8 devices, that we can toggle, turn on, turn off-- whatever.  All we have to do is write to our multiplexer to specify to which output byte we want to write.  No, we don't have to use all 256!
 
 
Prefer input bytes?  I can't imagine why, but multiplex those instead!  Hell, we could even have a split multiplex.  By multiplexing a few data cells as well, we could maintain a large bank of variables to pass between memory spaces, while still keeping sufficient multiplexed instruction space.
 
  
And yes-- you can multiplex your multiplexers.  256 banks of 256 banks of 30 bytes (2 megabytes, we've reached the Macintosh)....  You could multiplex your operations codes.  You could multiplex your registers.  You can use your computer to change how your computer behaves.  That's what a Turing machine is all about.
+
More to come.

Please note that all contributions to Dwarf Fortress Wiki are considered to be released under the GFDL & MIT (see Dwarf Fortress Wiki:Copyrights for details). If you do not want your writing to be edited mercilessly and redistributed at will, then do not submit it here.
You are also promising us that you wrote this yourself, or copied it from a public domain or similar free resource. Do not submit copyrighted work without permission!

To protect the wiki against automated edit spam, we kindly ask you to solve the following CAPTCHA:

Cancel Editing help (opens in new window)