About

想要把之前没有写完的 Chip-8 模拟器给写完 (其实已经写完了, 但是现在来看旧的做法并不是很好, 也没法拓展应用到其他的模拟器).

恰好之前在 Reddit 上看到了这篇文章的介绍, 想着不如试试.

This paper summarizes the set of tools and design approaches used in the development of the chip. Where possible we will show how Lisp was used as a specification or verification lanaguage.

注: 本文不是翻译. 只是阅读笔记以及可能可以学习的语法和设计的参考.

注: 原始的论文 (也有可能是会议报告就是了) 扫描得有些差劲. 不过可以在 webarchieve 上搞到.

注: 原始的论文里面用到了 Symbolics Lisp Machine 上的 Flavor, 而非现在 CL 中的 CLOS 作为 OOP, 稍微有些不同但是应该可以分辨.

Architectural Design

Architectural Simulation

This level of design is used to verify the virtual machine architecture and gather architectural statistics.

(defemulator add operand
  (multiple-value-bind
        (first-operand second-operand)
      (fetch-two-operands operand)
    (stack-push (+ first-operand second-operand))))

这部分的语法感觉可以学习, 用来作为模拟器的编写估计会非常有效.

Behavioral Simulation

Each module defines a set of I/O signals that constitute a module's communication ports. These specify the ports to be used in the circuit design.

(defmodule (adder ivory)
  :local-state ()
  :local-phase-1-registers ()
  :local-phase-2-registers ()
  :phase-1-registers ()
  :pahse-2-registers ()
  :phase-1-inputs    ()
  :phase-1-outputs   ()
  :pahse-2-inputs    (op1 op2)
  :phase-2-outputs   (external-bus)
  )

有点类似于 Verilog 中的

module adder (input op1, op2, output external-bus);

这样的感觉.

Each class has a method for simulating the behavior of a module during the first phase of the clock, another for simulating the second phase, and other for updating the window-oriented display after each phase.

(defaction (adder :execute-phase-2)
  (let* ((unsigned-op1 (32-bit op1))
         (unsigned-op2 (32-bit op2))
         (unsigned-result (+ unsigned-op1 unsigned-op2)))
    (setq external-bus unsigned-result)))

(defaction (adder :display-phase-2)
  (format window " op1 ~D ~& op2 ~D" op1 op2)
  (format window " ~& external-bus ~D" external-bus))

相当于运行模块 (execute-phase-2), 并显示模块输出 (display-phase-2), 感觉类似于编写对模块模拟/测试用的代码.

An assembler supports the development of the microcode module. The following represents the microcode specification for the ADD instruction:

(definstruction add
  (parallel
   (check-arithmetic-operands operand-1 operand-2)
   (pop2push (+ operand-1 operand-2))
   (enable-overflow-exception)
   (next-instruction)))

Structural Design

(def-std-cell-schematic ("simple-example"
                         :inputs  (p q a b)
                         :outputs (z w))
  (setq z (if (and p q)
              a
              b))
  (setq w (not (and a b) (or p q))))

When compiled, the logic is simplified and a rule-based technology selector optimizes the circuit used by choosing gates, merging gates and eliminating unnecessary inverters.

这部分类似于把 Lisp 表达式变成逻辑门 (这里的 setq 是会被视为 Verilog 中的 assign 还是单纯的视为 wire? 从 Schematic Editor Interface 的截图里面看起来像是单纯的 wire 就是了).

A simple pattern matching language is used to apply the rules. The rules are augmented as different situations arise during the course of the design.

(define-eqn-transformer AOI (NOR (AND b c d) a)
  (AND3-NOR2 a b c d))

(define-eqn-transformer AOI (NOR (NOR a b)
                                 (NOR c d))
  (OR2-OR2-AND2 a b c d))

(define-eqn-transformer AOI (AND (NOT (NOR a b))
                                 (NOT (NOR c d)))
  (OR2-OR2-AND2 a b c d))

这个思路感觉有点暴力, 但是估计确实还挺有效的? 编译之后用于减少逻辑门数量, (把多个逻辑门用一个标准单元实现代替) 以更有效地利用区域.

感觉也不是很难实现
;; -*- Package: ryo -*-

(defun non-conflicts-alist-union (alist &rest more-alists)
  "Merge alists.
If conflicts, return `nil'. "
  (let ((alist-union (copy-list alist))
        (conflicts-p nil))
    (loop for alist in more-alists
          do (loop for (var . val) in alist
                   for exisiting-val = (cdr (assoc var alist-union))
                   do (if exisiting-val
                          (unless (equal val exisiting-val)
                            (setf conflicts-p t))
                          (push (cons var val) alist-union)))
          if conflicts-p
            do (return NIL)
          finally (return alist-union))))

(defun pattern-match (pattern s-expr)
  "Match `s-expr' with `pattern'.
Return alist for (var . expr) or `nil' for not matched.

Pattern matching rules:
* If `pattern' is atom (stands for a single `var' name):
  the `var' will be binded with value `s-expr'
* If `pattern' is list (stands for a expression):
  will test if `s-expr' has the same list root,
  if same, try to match the rest arguments with rest `pattern'
"
  (cond ((atom pattern) (list (cons pattern s-expr)))
        ((atom s-expr)  nil)
        ((and (eq (first pattern) (first s-expr))   ; if `s-expr' has the same list root
              (= (length pattern) (length s-expr))) ; and try to match the rest arugments
         (let ((matches (mapcar #'pattern-match
                                (rest pattern) (rest s-expr))))
           (assert (notevery #'null matches))
           (apply #'non-conflicts-alist-union matches)))
        (t nil)))

(defun alist-leaves-subst (alist tree)
  "Substitude `tree' leaves with `alist'.

* If tree is atom (leaf), try to replace if tree is in alist;
* If tree is list (tree), replace only leaves (keep root same)"
  (if (atom tree)
      (or (cdr (assoc tree alist)) tree)
      (cons (first tree)
            (mapcar (lambda (node) (alist-subst alist node))
                    (rest tree)))))

(defun pattern-match-replace (pattern replace-pattern s-expr)
  "If `s-expr' matches `pattern', replace with `replace-pattern'.
Return `s-expr' itself if not matches.

Example:
+ if `pattern' is (fn a b c), `s-expr' is (+ 1 2 3),
  `replace-pattern' is (- a b c), return (- 1 2 3)
"
  (let ((match (pattern-match pattern s-expr)))
    (if match
        (values (alist-leaves-subst match replace-pattern) t)
        (values s-expr nil))))

(defun pattern-match-replace-all (pattern replace-pattern s-expr)
  "Try to replace `s-expr' recursively. "
  (multiple-value-bind (new-expr match-p)
      (pattern-match-replace pattern replace-pattern s-expr)
    (if match-p
        (values new-expr t)
        (if (atom s-expr)
            (values s-expr nil)
            (loop with matched = nil
                  with new-expr = ()
                  for expr in s-expr
                  do (multiple-value-bind (expr match-p)
                         (pattern-match-replace-all pattern replace-pattern expr)
                       (push expr new-expr)
                       (when match-p (setf matched t)))
                  finally (return (values (nreverse new-expr) matched)))))))

注: 这里的 pattern-match-replace-all 其实还有些问题:

  1. 无法化简到最简, 不过这个可以通过加一个循环来解决:
    (defun pattern-match-replace-all! (pattern replace-pattern s-expr)
      "Replace all. "
      (loop for (expr match-p) = (list s-expr t)
              then (multiple-value-list
                    (pattern-match-replace-all pattern replace-pattern expr))
            while match-p
            finally (return expr)))
    

    不过可以注意到这个是一个正则序的应用, 理论上来说也可以变成应用序. 某种程度上来说是否也算是一种更加变态的宏呢?

  2. 匹配过于简单了, 如果能再带上带参数的匹配, 类似正则表达式的匹配, 以及根据匹配来展开的估计就更帅了.

同理, 除了简化逻辑门, 还有电路加速 (recognize circuit speed-up rules) 的匹配规则:

(define-eqn-transformer INVERSE (XOR a (NOT b))
  (XNOR a b))

以及用于简化布线占用 (reducing channel routing occupancy) 的替换规则:

(define-eqn-transformer IMPLODE (NOT (NAND a b))
  (AND a b))

更底层的 OOP 数据库的操作:

(defmethod (internal-capacitance rsim-network-mixin) ()
  (loop for node in nodes
        unless (or (eq node gnd)
                   (eq node vdd))
          sum (node-capacitancne node)))

Physical Design

前面做得有点类似 KiCAD 中的 Scheme 的工作, 下面的类似于 Layout 的工作.

A standard cell layout system automatically generates symbolic layout from control schematic diagrams with the option of using port locations specified by the mask-outline. Both min-cut and thermal-annealing approaches have been used…

Data paths are constructed manualy. Basic cells such as registers, muxes and addres are provided in a data-path standard cell library. To improve generator horizontally abuts three cells and raises the instance ports to this level of the hierarchy.

(defaspect-generator (data-path :virtual-grid) (flag)
  (HORIZONTALLY-ABUT 'module-a
                     (if flag
                         'module-b
                         'module-d)
                     'module-c)
  (import-ports-on-edges))

相当于定义自动布线规则, 然后将一些手动布线的组件作为标准库.

When all modules have been designed symblically and compacted, the NS interactive editor is used to specify a slicing style floorplan. Using the connectivity of the corresponding schematic, this floorplan is used as the basis to automatically place and route the entire chip. A global router first assigns nets to the routing channels. When this is complete, modules are composed according to the floorplan composition ordering. As they are connected, power ground and clocks are also routed. This chip composition takes 2 hours to run for the complete Ivory chip. At early stages of design, partial floorplans can be constructed using estimates of block sizes. The following specifies the “expample” module which has an estimated size of 250u by 300u and has the inputs entering on the top and the outputs exiting on the bottom.

(def-mask-outline example (250 300)
  (:top "s<3:0>"
        memory-write-pending
        bus-master-pin)
  (:bottom mcw mcr mcrw))

类似于通过给提示来加速 floorplan 布线.

A network comparison program is able to compare any two extracted networks (i.e. from virtual-grid layouts, schematics, mask layouts and vendor net-list files). Interactive feedback is provided to identify suspicious nodes. No node names are necessary.

A fast interactive DRC is provided for finaly mask artwork checks.

这部分的工作流程应该是可以固定下来, 其中的 floorplan 的算法估计有更新的迭代. 鉴于我不可能接触到门电路级别的芯片 (最多 FPGA 吧? ), 所以感觉如果真闲的话, 可以考虑用这种方式去替代/学习几个开源的 router 程序? (虽然感觉意义不大)

Simulatioin

Circuit Simulation

A switch-level simulator with timing (RSIM) was used to bridge both the gate and switch level circuit simulation requirements. Apart from being optimized for fast simulation, our version of RSIM has the ablility to specify functional models in the following manner (a RAM):

(deffunctional-model cache-memory
  :inputs ("addr<6:0>"
           -row-enable
           write
           "write-data<39:0>")
  :outputs (("data<39:0>" :pd-size 8/1 :pu-size 16/1))
  :local-state ((cache-array
                 :initform
                 (make-array 128 :initial-element 'x)))
  :delays ((row-enable↓→data :delay 25))
  :timing-constraints ((addr-row-enable :setup 16)
                       (write-datawrite :setup 15))
  :model (cond ((eql addr 'x) (setq data 'x))
               ((eql write 1) (setf (aref cache-array addr)
                                    write-data))
               ((eql write 0) (setq data (aref cache-array addr)))))

这部分做得有点像是手工编写的逻辑, 而不是很像一个仿真的结果? 对 SPICE 并不是很了解, 可能之后可以去阅读一下 SPICE 的实现和算法. 在这里 (NS) 的模拟中, SPICE 干的活是通过一个 lisp interface 做电路模拟.

后注: 这里的 functional-level 和后面的 switch-level 可以进行一个区分, functional-level 有点类似于 “解析解” 电压上升和下降的模式都用一个简单的模型表示; 而 switch-level 则是在门电路级别上进行一个模拟仿真 (更加耗时, 但更准确)

Access to the RSIM simulator is available in parallel either via Lip code that can set, read and compare values on a circuit node, or via mouse clicks on a schematic displayed in the graphics editor window. A hierarchical schematica can be traversed using PUSH/POP commands… Test programs writtern in Lisp use a protocol consisting of three generic functions:

  • value, which returns the value of a node,
  • set-value, which sets the value of a node, and
  • sim-step, which propagates all changes through the network until no further changes occur. Optional arguments to sim-step can specify the length of the simulation period
一段论文里面凡尔赛的话

To give some idea of the extensibility of NS, an exerimental timing simulator mode based on backward Eular intergration was added to NS in a matter of morning by a designer. We intend to incorporate parallel fault simulation into RSIM in the future, but it will probably take more than a morning.

哼, backward Eular 我也可以用一个早上 (大概是晚上) 实现. 只是估计没法保证在这么大的一个项目里面应用吧…

背后应该是一个解多元线性微分方程组的活 (简单看了一下 Kielkowski Inside SPICE第二章, 更多细节并没有深入), 那么用 backward Eular 作为求解器感觉很合理.

Hardware Simulation Acceleration

简而言之就是 RSIM 在仿真大规模的芯片 (门电路) 的时候太慢了:

operating one module at a time at the switch level while the others operate at the functional level

加速方法就是混用 functional level 和 switch level 的模拟.

Design Verification

Functional Comparison

思路就是对 behavioral simulator 和 RSIM 同时进行应用, 并对结果进行比较 (on a signal by signal basis as the bahavioral simulator is used to specify the modularity and communication between modules of the chip).

During verification, set-value message are passed to both the behavioral simulator and RSIM. Each particular simulator takes the appropriate action to set internal nodes to a particular value. This is achieved using a “forwarding network” which takes a list of two networks, on the RSIM network and the other the behavioral simulator “network” and the node under question and successively applies the set-value procedure to both nodes in each simulator. Thus the setv function is used to set values in both networks:

(defmethod (setv forwarding-network) (node value)
  (set-value rsim-network node value)
  (set-value behavioral-module node value))

As the functional simulator works with a two-phase clock, a function is written to emulate the clocks for the RSIM simulation. A simple version of this would look as follows:

(defmethod (simulate-phase-1 rsim-network-mixin) ()
  (set-value self 'ph1 1)
  (sim-step self)
  (set-value self 'ph1 0)
  (sim-step self))

This message is applied to both networks to advance through a phasel clock cycle. The following forwarding network function calls both phasel execution functions:

(defmethod (phase1 forwarding-network) ()
  (simulate-phase-1 rsim-network)
  (simulate-phase-1 bahavioral-model))

The verify command operates by asking the two simulators for values and then comparing the results. If the results disagree, a debugging session with the user is initiated.

(declare-bus 'op1 32)
(declare-bus 'op2 32)
(declare-bus 'external-bus 32)

(defun compare-adder-ops (op1 op2)
  (phase1)
  (setv 'op1 op1)
  (setv 'op2 op2)
  (phase2)
  (verify 'external-bus))

Such programs are written by designers to test the functionality of individual modules.

To provide higher level tests, a spy strategy was developed. With this facility, the complete behavioral simulator could be exercised by Lisp test programs. Arbitrary collections of modules can be grouped together (to model physical layout groupings) and their colective inputs and outputs monitored to provide a trace history of the boundary signals. The storage of the history allows both interactive and batch simulation. RSIM simulation code consisting of setv and verify statements, is generated from the history. This code was then applied directly to the extracted network with any discrepancies detected by verify errors.

The spy code was extended to allow interfacing to an engineering tester. This allows interactive debugging of tests in an engineering environment that was closely linked to the program development environment of the Lisp machine.

这部分感觉思路很合理, 就是实现起来感觉可能并不会像这里说的那么简洁? 基本的几个 method 的实现感觉可以这样做, 如果想要拓展的话, 不清楚是通过 :after :before 这样的方式去做 patch 还是重新写逻辑.

一个感觉可以学习的点

正好我现在也要做模拟和实验的比对, 这种验证方式不清楚是否可以学习一下.

Timing Analysis

A timing analyzer based on finding critical paths through transistor networks was implemented based on Crystal.

Version Control

后记

感觉可以学的:

  • defemulator (用来重构 Chip8 模拟器)
  • defmodule (用来和实验, 此处为 RSIM 进行检验)
  • definstruction

感觉有时间可以折腾的:

  • SPICE 的模拟以及和 deffunctional-model 的验证和比对
  • layout 和 router (感觉意义不大)

感觉差不多已经实现的:

  • define-eqn-transformer 模式匹配替换的感觉没问题了
  • 实际上配合很早之前的逻辑门生成 (大概是在数电或者是 Turing Complete 那边), 加上这个模式匹配替换规则, 应该来说可以做到 伪 HDL 代码到门电路的工作

一些奇怪的点子:

  • 之前做的 节点编辑器, 之前的想法是能够做成用于 iGEM 的 BioBlock 的编辑器. 看到这个之后感觉其实可以迁移一下应用这个逻辑来进行一些拓展:
    (def-std-bioblock-part ("promopter"
                            :enable ((a :by "Ara h 1"))
                            :next (b))
      ;; some behavior code
      (if a
          (enable  b)
          (disable b)))