Utility Learning
The Utility Theory
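- Noise is controlled by the utility noise parameter :egs (written $s$ below).
- The noise follows a logistic distribution with a mean of 0 and a variance of $\sigma^2 = \frac{\pi^2}{3} s^2$.
- If multiple productions match, the probability that production $i$ fires is $P(i) = \frac{e^{U_i / \sqrt{2}s}}{\sum_j e^{U_j / \sqrt{2}s}}$, where $U_j$ is the utility of matching production $j$.
- As implemented in the program: noise is added to each utility, and the production with the highest resulting utility is chosen.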
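The selection rule above can be sketched in a few lines of Python. This is only an illustration of the two formulas, not ACT-R code: the function names are made up for this note, and the utilities and noise value in the example are illustrative numbers chosen here.

```python
import math

def noise_variance(s):
    """Variance of the logistic utility noise for noise parameter s (:egs)."""
    return (math.pi ** 2) * (s ** 2) / 3.0

def firing_probabilities(utilities, s):
    """Probability of each matching production firing.

    Softmax over the utilities with temperature sqrt(2) * s.
    The maximum is subtracted first for numerical stability.
    """
    temperature = math.sqrt(2.0) * s
    highest = max(utilities.values())
    weights = {name: math.exp((u - highest) / temperature)
               for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Illustrative utilities for three competing productions (assumed values).
utilities = {"decide-over": 13.0, "force-over": 10.0, "force-under": 10.0}
probs = firing_probabilities(utilities, s=3.0)
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

With a larger s the probabilities move closer together, so lower-utility productions get chosen more often.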
Building Sticks Example
- We have an unlimited supply of building sticks in three lengths.
- Goal: create a target stick of a particular length.
- Two basic strategies:
- undershoot: start with a shorter stick, then add others
- overshoot: start with a stick that is too long, then saw off lengths equal to the others
- The cognitive model performing the BST task (G is the goal length; B and C are stick lengths):
- over = abs(G - B)
- under = abs(G - C)
- It chooses the strategy that gets closer to the goal, but only when that strategy is closer by more than 25 than the other.
- If there is a clear difference, then three productions can fire:
- decide-under or decide-over (whichever matches the closer strategy)
- force-under
- force-over
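The matching logic described above can be sketched as follows. This is a plain Python illustration rather than the model's productions; the function name, the `margin` parameter, and the stick lengths in the example are assumptions made for this note.

```python
def matching_productions(goal, b, c, margin=25):
    """Return the strategy productions that can match for one BST problem.

    over and under follow the definitions in the notes; a decide-*
    production matches only when its strategy is closer to the goal
    by more than `margin`.
    """
    over = abs(goal - b)
    under = abs(goal - c)
    matches = ["force-under", "force-over"]  # these can always match
    if over < under - margin:
        matches.insert(0, "decide-over")
    elif under < over - margin:
        matches.insert(0, "decide-under")
    return matches

# Hypothetical stick lengths, chosen only to show both cases.
clear_case = matching_productions(goal=125, b=250, c=55)
close_case = matching_productions(goal=100, b=120, c=90)
print(clear_case)
print(close_case)
```

In the first case under (70) beats over (125) by more than 25, so decide-under competes with the two force productions; in the second case neither strategy is clearly closer, so only the force productions match.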
Utility Learning
- If utility learning is enabled, utilities are updated from rewards as the model runs:
- $U_i(n) = U_i(n-1) + \alpha \left[ R_i(n) - U_i(n-1) \right]$
- $\alpha$: learning rate, default = 0.2, set with :alpha
- $R_i(n)$: effective reward, which is the reward value minus the time from the production being selected to the reward being received.
- As a result, productions that are more distant from the reward receive less of it.
- The reinforcement goes back to all of the productions that were selected between the current reward and the previous reward.
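As a small worked example of the utility learning update (new utility = old utility + learning rate × (effective reward − old utility)), here is a Python sketch. The function names and the numbers in the example are chosen for illustration only; the default learning rate of 0.2 is the one noted above.

```python
def effective_reward(reward, reward_time, selection_time):
    """Reward value minus the time elapsed since the production was selected."""
    return reward - (reward_time - selection_time)

def update_utility(previous_utility, eff_reward, alpha=0.2):
    """One step of the utility learning rule with learning rate alpha."""
    return previous_utility + alpha * (eff_reward - previous_utility)

# Illustrative numbers: a production with utility 10 was selected at 2.0 s,
# and a reward of 20 arrives at 5.0 s.
eff = effective_reward(reward=20.0, reward_time=5.0, selection_time=2.0)
new_utility = update_utility(previous_utility=10.0, eff_reward=eff)
print(eff, new_utility)
```

Here the effective reward is 17 and the utility moves one fifth of the way from 10 toward it, giving 11.4 (up to floating-point rounding).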
- Ways to provide a reward:
- attach a reward to a production
- use the trigger-reward command (see CodeUnit6)
- An attached reward is applied after the corresponding production fires:
- (spp read-done :reward 20)
- (spp pick-another-strategy :reward 0)
- Consider the following sequence:
- … --> pick-another-strategy --> p1 --> p2 --> p3 --> read-done
- In this case, only p1, p2 and p3, which fire after pick-another-strategy, receive the reward from read-done.
- read-done also receives its own reward.
- pick-another-strategy receives none of the reward from read-done.
- Productions before pick-another-strategy receive a negative effective reward, because the reward of 0 is reduced by the elapsed time.
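The reward scenario above can be traced with a toy simulation. This is a sketch under stated assumptions, not ACT-R itself: it assumes every production takes 0.05 s from selection to firing, that all utilities start at 0, that a production p0 stands in for whatever fired before pick-another-strategy, and that the production carrying the reward is itself among those credited. The names `run_sequence`, `ACTION_TIME` and `p0` are made up for this note.

```python
ACTION_TIME = 0.05  # assumed seconds between selection and firing

def run_sequence(sequence, attached_rewards, utilities, alpha=0.2):
    """Fire productions in order and propagate attached rewards.

    Returns a dict mapping each production to the list of effective
    rewards it was credited with. `utilities` is updated in place.
    """
    credited = {}
    pending = []  # (name, selection_time) since the previous reward
    clock = 0.0
    for name in sequence:
        pending.append((name, clock))  # production is selected
        clock += ACTION_TIME           # production fires
        if name in attached_rewards:
            reward = attached_rewards[name]
            for prod, selected_at in pending:
                eff = reward - (clock - selected_at)
                credited.setdefault(prod, []).append(eff)
                utilities[prod] += alpha * (eff - utilities[prod])
            pending = []  # later rewards do not reach these productions
    return credited

sequence = ["p0", "pick-another-strategy", "p1", "p2", "p3", "read-done"]
rewards = {"read-done": 20.0, "pick-another-strategy": 0.0}
utilities = {name: 0.0 for name in sequence}
credited = run_sequence(sequence, rewards, utilities)
for name in sequence:
    print(name, credited[name], round(utilities[name], 3))
```

In this sketch p0 and pick-another-strategy are credited only with the reward of 0 less the elapsed time, so their utilities drop slightly, while p1, p2, p3 and read-done share the reward of 20, with productions closer to read-done getting slightly more.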
Learning in the Building Sticks Task
- :ul enables utility learning
- :ult turns on the utility learning trace
Additional Chunk-type Capabilities
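- Default chunk-type slot values
- +visual> isa move-attention screen-pos =c
- Here "isa move-attention" is equivalent to "cmd move-attention", because move-attention is the default value of the cmd slot.
- The isa keyword not only names the chunk-type; it also fills in the chunk's slots with that type's default values.
- No default values: (chunk-type _name _slot1 _slot2)
- Default values for specific slots: (chunk-type _name (_slot1 1) (_slot2 2))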
- Chunk-type hierarchy: object-oriented style inheritance (subtypes)
- Use (:include typeA) to declare a parent type.
- For example, line is a subtype of visual-object.
- A subtype contains all the slots that its parent has.
- It can contain additional slots.
- It can use different default values for some slots.
- A subtype can inherit from multiple parent types:
- (chunk-type (d (:include a) (:include b)) slot4) ; d inherits from a and b
- The parent types also gain access to all of the slots from their children.
- When a chunk is created using a parent type, slots from its subtypes can also be used, for example:
- (define-chunks (isa visual-object line_start 1 line_end 2))
- line_start and line_end are slots of the subtype "line".