Means-ends analysis Northwestern University CS 395 Behavior-Based Robotics Ian Horswill


Means-ends analysis

Northwestern University
CS 395 Behavior-Based Robotics

Ian Horswill

Review

  • Robot operates in an environment
      • State space S
      • Set of possible motor outputs A
      • Dynamics (physics) that determines how the environment changes state
  • Continuous dynamics (continuous-time actions)
      • f = dS/dt: S × A → S,
      • f = d²S/dt²: S × A → S, etc.
  • Discrete dynamics (atomic/ballistic actions, discrete time)
      • δ: S × A → S

Review

  • Want to construct a policy to make the robot do the right thing
      • p: S → A
  • Complete environment-robot system evolves
      • Continuous case: curve through state space
          ds/dt = f(s, p(s))
      • Discrete case: system evolves through a series of states under the transition function δ
          s0
          s1 = δ(s0, p(s0))
          s2 = δ(s1, p(s1)) = δ(δ(s0, p(s0)), p(δ(s0, p(s0))))
          Etc.
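The discrete case can be sketched in a few lines of Python. This is only an illustration: the one-dimensional world, the `delta` transition function, and the `policy` below are hypothetical stand-ins, not part of the course material.

```python
def rollout(delta, policy, s0, steps):
    """Return [s0, s1, ..., s_steps] with s_{t+1} = delta(s_t, policy(s_t))."""
    states = [s0]
    for _ in range(steps):
        s = states[-1]
        states.append(delta(s, policy(s)))
    return states


# Hypothetical world: the state is an integer position, an action is a step of -1, 0, or +1.
GOAL = 3


def delta(s, a):
    """Discrete dynamics: the new position is the old position plus the step."""
    return s + a


def policy(s):
    """Step toward GOAL, and stand still once there."""
    return (GOAL > s) - (GOAL < s)


trajectory = rollout(delta, policy, 0, 5)
print(trajectory)  # [0, 1, 2, 3, 3, 3]
```

The policy never looks ahead; the trajectory is entirely determined by the pair (dynamics, policy) and the initial state.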

Error feedback control

  • Goal state sg
  • Control action computed from error
      • ds/dt = f(s - sg)
      • d²s/dt² = f(s - sg)
  • Linear feedback control
      • f is a linear operator
      • ds/dt = k(s - sg)
          • P control (proportional control)
          • k is a gain you multiply by
          • k is a matrix when s is a vector
      • d²s/dt² = kp(s - sg) + kd ds/dt + ki ∫ s dt
          • PID control
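As a rough numerical illustration of P control, the sketch below integrates ds/dt = k(s - sg) with forward Euler steps. The function name, the step size `dt`, and the particular gain are assumptions made for the example. With the error written as s - sg, the gain k has to be negative for the state to move toward the goal.

```python
def simulate_p_control(s0, s_goal, k, dt, steps):
    """Integrate ds/dt = k * (s - s_goal) with forward Euler steps."""
    s = s0
    history = [s]
    for _ in range(steps):
        s += k * (s - s_goal) * dt  # one Euler step of the first-order law
        history.append(s)
    return history


# Assumed values: start at 0, goal at 1, gain -2, time step 0.1.
history = simulate_p_control(s0=0.0, s_goal=1.0, k=-2.0, dt=0.1, steps=50)
errors = [abs(s - 1.0) for s in history]
print(f"initial error {errors[0]:.3f}, final error {errors[-1]:.6f}")
```

Each step multiplies the error by (1 + k·dt), so for this choice of k and dt the error shrinks by a factor of 0.8 per step.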

Behavior-based control (“Bottom-up”)

Combine policies by running them in parallel

  • Behavior = policy + trigger
  • Bottom-up integration of behaviors
      • Map several behaviors to a single composite behavior (or composite policy)
  • Several different composition operators
      • Behavior-or (prioritization/subsumption)
      • Behavior-+ (motor schemas/potential fields)
      • Behavior-max
      • Weighted voting
      • Etc.
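One possible reading of three of these operators is sketched below in Python. The `Behavior` class, the two example behaviors, and the exact semantics chosen for each operator (first active behavior wins, sum of active outputs, highest activation wins) are assumptions for illustration; the GRL operators may differ in detail.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Vector = Tuple[float, float]


@dataclass
class Behavior:
    name: str
    trigger: Callable[[dict], float]  # activation level; 0 means inactive
    policy: Callable[[dict], Vector]  # motor output for the current state


def behavior_or(behaviors: Sequence[Behavior], state: dict) -> Optional[Vector]:
    """Prioritization: the first active behavior in the list wins."""
    for b in behaviors:
        if b.trigger(state) > 0:
            return b.policy(state)
    return None


def behavior_sum(behaviors: Sequence[Behavior], state: dict) -> Vector:
    """Motor-schema style: add the outputs of all active behaviors."""
    x, y = 0.0, 0.0
    for b in behaviors:
        if b.trigger(state) > 0:
            vx, vy = b.policy(state)
            x, y = x + vx, y + vy
    return (x, y)


def behavior_max(behaviors: Sequence[Behavior], state: dict) -> Optional[Vector]:
    """The behavior with the highest activation level wins."""
    best = max(behaviors, key=lambda b: b.trigger(state), default=None)
    if best is None or best.trigger(state) <= 0:
        return None
    return best.policy(state)


# Hypothetical behaviors: back away from obstacles, otherwise head for the goal.
avoid = Behavior("avoid", lambda s: 1.0 if s["obstacle_near"] else 0.0, lambda s: (-1.0, 0.0))
seek = Behavior("seek", lambda s: 0.5, lambda s: (1.0, 1.0))

near = {"obstacle_near": True}
print(behavior_or([avoid, seek], near))   # (-1.0, 0.0)
print(behavior_sum([avoid, seek], near))  # (0.0, 1.0)
print(behavior_max([avoid, seek], near))  # (-1.0, 0.0)
```

In all three cases the result is itself something that maps a state to a motor output, which is what lets composite behaviors be composed again.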

Plan-based control (“Top-down”)

Combine policies by running them serially

  • Behaviors → atomic actions
      • Still policy + activation level
      • Externally triggered
      • Self-terminating
  • Combine behaviors using serial controllers (plans)
      • Finite state machines
      • Individual states can
          • Trigger actions
          • Wait for them to terminate
          • Wait for other external conditions
          • Etc.

Planning-based control (“Top-down”)

Combine policies “non-deterministically”

  • Idea: “guess the action that will ultimately work”
      • i.e. guess the one that leads to the goal
  • Problem: this doesn’t help much
      • Don’t know which action(s) will ultimately work
      • If you guess wrong, you’re screwed
  • Solution: simulation
      • Run actions in simulation
      • Search through possible sequences of actions (plans) to find one that works and remember it
      • Execute the successful plan in the real world
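The "search in simulation" idea can be sketched as a breadth-first search over action sequences. Everything concrete here is assumed for the example: the 3×3 grid, the wall positions, the action names, and the choice of breadth-first search (the slides do not commit to a search strategy).

```python
from collections import deque

# Hypothetical 3x3 grid world with two wall cells.
MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
WALLS = {(1, 0), (1, 1)}


def delta(state, action):
    """Simulated dynamics: move one cell unless blocked by a wall or the grid edge."""
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    if nxt in WALLS or not (0 <= nxt[0] < 3 and 0 <= nxt[1] < 3):
        return state
    return nxt


def plan_by_simulation(start, is_goal, max_depth=10):
    """Search action sequences in simulation; return the first plan that reaches a goal state."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        if len(plan) >= max_depth:
            continue
        for action in MOVES:
            nxt = delta(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # no plan found within max_depth


plan = plan_by_simulation((0, 0), lambda s: s == (2, 0))
print(plan)  # ['N', 'N', 'E', 'E', 'S', 'S']
```

Only the returned plan would be executed on the real robot; all the wrong guesses happen inside `delta`, where they cost nothing but computation.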

Logic-based representations of the state space

  • Represent states using propositions (true/false statements)
      • Find a set of propositions that let you distinguish all the states you care about
      • State = truth of each proposition
  • Advantage: partial state descriptions
      • P ∧ Q is the set of states in which both P and Q are true
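A small sketch of partial state descriptions, assuming three propositions named P, Q, and R (the names and helper functions are made up for the example). A full state assigns a truth value to every proposition; a partial description fixes only some of them and so denotes a set of states.

```python
from itertools import product

PROPS = ("P", "Q", "R")


def all_states(props):
    """Enumerate every assignment of truth values to the propositions."""
    return [dict(zip(props, values)) for values in product([False, True], repeat=len(props))]


def satisfies(state, partial):
    """A state matches a partial description if it agrees on every proposition mentioned."""
    return all(state[p] == value for p, value in partial.items())


states = all_states(PROPS)
p_and_q = [s for s in states if satisfies(s, {"P": True, "Q": True})]

print(len(states))   # 8
print(len(p_and_q))  # 2 (R may be either true or false)
```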

Means-ends analysis

  • Pair goals up with actions
      • For each proposition, keep track of the actions that can make it true
      • For each action, write the precondition (partial state description) for being able to run it
  • To solve the goal P ∧ Q
      • Look up the action A that achieves P
      • Recursively solve precondition(A)
      • Run A
      • Recursively solve Q without “clobbering” P

GPS (Newell and Simon)

  • “General Problem Solver”
  • Used means-ends analysis
  • Assumed priority ordering on propositions
  • Algorithm:

    GPS(goal)
      while goal not yet true
        p = highest priority unsatisfied subgoal
            (subgoal = proposition inside of goal)
        a = action to solve p
        GPS(precondition(a))
        do a
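The loop above can be sketched in Python roughly as follows. This is a toy under stated assumptions: the world is a mutable set of true propositions, "do a" is simulated by applying the action's effects to that set, and the coffee-making actions, the priority table, and the recursion limit are all invented for the example.

```python
# Hypothetical actions: each has preconditions, propositions it adds, and propositions it deletes.
ACTIONS = [
    {"name": "brew", "pre": {"have_water", "have_beans"}, "adds": {"have_coffee"}, "deletes": set()},
    {"name": "fill_kettle", "pre": set(), "adds": {"have_water"}, "deletes": set()},
    {"name": "grind", "pre": set(), "adds": {"have_beans"}, "deletes": set()},
]

# Assumed priority ordering on propositions (higher number = solved first).
PRIORITY = {"have_coffee": 3, "have_water": 2, "have_beans": 1}


def gps(goal, state, log, depth=0, max_depth=20):
    """Means-ends analysis: repeatedly pick an unsatisfied subgoal and an action that achieves it."""
    if depth > max_depth:
        raise RuntimeError("recursion limit reached; GPS can loop on hard problems")
    while not goal <= state:
        p = max(goal - state, key=PRIORITY.__getitem__)  # highest-priority unsatisfied subgoal
        candidates = [a for a in ACTIONS if p in a["adds"]]
        if not candidates:
            raise ValueError(f"no action achieves {p}")
        a = candidates[0]
        gps(a["pre"], state, log, depth + 1, max_depth)  # first make the action runnable
        state.difference_update(a["deletes"])            # "do a": simulate its effects
        state.update(a["adds"])
        log.append(a["name"])


world = set()
executed = []
gps({"have_coffee"}, world, executed)
print(executed)  # ['fill_kettle', 'grind', 'brew']
```

Note that nothing here protects an already-achieved subgoal from being clobbered by a later action; the recursion limit is only a guard against the resulting loops.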

The STRIPS representation

  • Define actions in terms of
      • Add list: propositions the action makes true
      • Delete list: propositions the action may make false
      • Precondition list: propositions that must be true in order to run the action

Planning with STRIPS

  • Goal = set of propositions to make true
  • Algorithm:

    STRIPS(initial, goal)
      for each subgoal p in goal not in initial
        for each action a with p in its add list
          try the plan:
            STRIPS(initial, precondition(a))
            a
            STRIPS(initial - delete_list(a) + add_list(a), goal)
          if both recursive calls worked, we win
          else, try another action, or another subgoal
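A minimal Python sketch of this recursive planner follows. It differs from the pseudocode in one respect: the state handed to the second recursive call is the simulated result of running the prefix plan and then the action, rather than `initial` with only the action's own lists applied. The `Action` type, the depth bound, and the cup-fetching domain are assumptions for the example.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]     # precondition list
    add: FrozenSet[str]     # add list
    delete: FrozenSet[str]  # delete list


def apply_plan(state: FrozenSet[str], plan: List[Action]) -> FrozenSet[str]:
    """Simulate a plan, checking each precondition along the way."""
    for a in plan:
        assert a.pre <= state, f"precondition of {a.name} not met"
        state = (state - a.delete) | a.add
    return state


def strips(state: FrozenSet[str], goal: FrozenSet[str],
           actions: List[Action], depth: int = 6) -> Optional[List[Action]]:
    """Return a list of actions that achieves goal from state, or None if none is found."""
    if goal <= state:
        return []
    if depth == 0:
        return None
    for p in sorted(goal - state):             # each unsatisfied subgoal
        for a in actions:
            if p not in a.add:
                continue
            prefix = strips(state, a.pre, actions, depth - 1)
            if prefix is None:
                continue
            after = (apply_plan(state, prefix) - a.delete) | a.add
            suffix = strips(after, goal, actions, depth - 1)
            if suffix is not None:
                return prefix + [a] + suffix   # both recursive calls worked
    return None                                # try nothing further at this depth


# Hypothetical domain: fetch a cup from the kitchen and bring it back to the office.
ACTIONS = [
    Action("go_to_kitchen", frozenset({"at_office"}), frozenset({"at_kitchen"}), frozenset({"at_office"})),
    Action("pick_up_cup", frozenset({"at_kitchen"}), frozenset({"holding_cup"}), frozenset()),
    Action("go_to_office", frozenset({"at_kitchen"}), frozenset({"at_office"}), frozenset({"at_kitchen"})),
]

initial = frozenset({"at_office"})
goal = frozenset({"at_office", "holding_cup"})
plan = strips(initial, goal, ACTIONS)
print([a.name for a in plan])  # ['go_to_kitchen', 'pick_up_cup', 'go_to_office']
```

The delete list is what makes the example interesting: going to the kitchen clobbers `at_office`, so the planner has to re-achieve it after picking up the cup.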

Reactive planning in fully reactive systems

  • Collection of behaviors
      • Each behavior achieves some goal(s)
      • Each behavior has some precondition(s)
  • Higher level system drives some goal signal
  • Goal signals
      • Activate behaviors that achieve them
      • Drive goal signals of preconditions
  • Examples: GAPPS (Kaelbling 90), Behavior Networks (Maes 90)

GRL example

  • Extended STRIPS representation
  • Operators are just behaviors that activate themselves when they can achieve a goal.

    (define-operator name motor-vector
      (achieves add-list-signals …)
      (clobbers delete-list-signals …)
      (preconditions precondition-signals ...)
      (serial-preconditions precondition-signals ...)
      (required-resources names ...)
      (priority number))

Computing activation-levels

  • A behavior is runnable if all its preconditions are satisfied
  • It is desirable if
      • It satisfies a maintenance goal
      • It satisfies some unsatisfied goal of achievement
  • It is unconflicted if
      • It doesn’t clobber a satisfied goal or a maintenance goal, and
      • None of its resources are required by desirable operators of higher priority
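These three conditions can be modeled directly on sets of propositions. The Python below is an illustrative model, not the GRL implementation: the `Operator` class, the `grasp` and `wave` operators, and the shared `arm` resource are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    achieves: frozenset                    # add list
    clobbers: frozenset = frozenset()      # delete list
    preconditions: frozenset = frozenset()
    resources: frozenset = frozenset()
    priority: int = 0


def runnable(op, true_props):
    """All preconditions are currently satisfied."""
    return op.preconditions <= true_props


def desirable(op, true_props, achieve_goals, maintain_goals):
    """Achieves a maintenance goal or a not-yet-satisfied goal of achievement."""
    unsatisfied = achieve_goals - true_props
    return bool(op.achieves & (unsatisfied | maintain_goals))


def unconflicted(op, ops, true_props, achieve_goals, maintain_goals):
    """Clobbers nothing protected, and no desirable higher-priority operator needs its resources."""
    protected = (achieve_goals & true_props) | maintain_goals
    if op.clobbers & protected:
        return False
    return not any(
        other is not op
        and other.priority > op.priority
        and bool(other.resources & op.resources)
        and desirable(other, true_props, achieve_goals, maintain_goals)
        for other in ops
    )


def should_run(op, ops, true_props, achieve_goals, maintain_goals=frozenset()):
    return (desirable(op, true_props, achieve_goals, maintain_goals)
            and runnable(op, true_props)
            and unconflicted(op, ops, true_props, achieve_goals, maintain_goals))


# Hypothetical operators that compete for the same arm.
grasp = Operator("grasp", frozenset({"holding"}), preconditions=frozenset({"near_object"}),
                 resources=frozenset({"arm"}), priority=5)
wave = Operator("wave", frozenset({"greeted"}), resources=frozenset({"arm"}), priority=1)
OPS = [grasp, wave]
GOALS = frozenset({"holding", "greeted"})

before = frozenset({"near_object"})
print(should_run(grasp, OPS, before, GOALS))  # True
print(should_run(wave, OPS, before, GOALS))   # False: grasp wants the arm and outranks wave

after = frozenset({"near_object", "holding"})
print(should_run(wave, OPS, after, GOALS))    # True: grasp is no longer desirable
```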

Compile-time property lists

(define-signal-property (add-list x) '())

(define-signal-property (delete-list x) '())

(define-signal-property (preconditions x) '())

(define-signal-property (serial-preconditions x) '())

(define-signal-property (priority x) 0)

(define-signal-property (required-resources x) '())

Making the behavior

    (letrec ((the-behavior (behavior (run? the-behavior)
                                     motor-vector-signal)))
      the-behavior)

    (define-signal (run? x)
      (and (desirable? x)
           (runnable? x)
           (not (conflicted? x))))

Computing desirable?

    (define-signal-modality (desirable? x)
      (define adds (add-list x))
      (signal-expression
       (or (apply or (unsatisfied-goal adds))
           (apply or (maintain-goal adds)))
       (drives (goal (runnable? x)))
       (operator x)))

Computing runnable?

    (define-signal-modality (runnable? x)
      (signal-expression
       (parallel-and (apply parallel-and (preconditions x))
                     (apply serial-and (serial-preconditions x)))))

Gatherer functions

  • Accumulators can be declared with a gatherer
      • The gatherer is called by the compiler with a list of all signals being compiled
      • It returns the signals that should be used as inputs to the accumulator
  • Gatherers are called after signal expansion
      • They’re only passed the list of primitive signals into which calls to signal procedures have been expanded

Computing conflicted?

    (define-signal-procedure (conflicted? x)
      (define my-priority (priority x))
      (let ((high-priority-clobbered-goals
             (filter (lambda (g)
                       (>= (priority g) my-priority))
                     (delete-list x))))
        (signal-expression
         (apply or
                (accumulate or (make-conflict-set-gatherer x))
                (satisfied-goal ,high-priority-clobbered-goals)))))

Computing conflicted?

    ;;; A signal is conflicted if some other higher priority signal needs
    ;;; one of its resources or if it clobbers a goal we’ve already achieved.
    ;;; Since conflicted? already checked for clobbering, we only need to
    ;;; worry about finding operators that need our resources.
    (define (make-conflict-set-gatherer me)
      (lambda (ignore signal-list)
        (define resources (required-resources me))
        (define (desired-resource? r)
          (memq r resources))
        (define (steals-my-resource? op)
          (any desired-resource? (required-resources op)))
        …
        ))

Computing conflicted?

    (define (make-conflict-set-gatherer me)
      (lambda (ignore signal-list)
        …
        (define my-priority (priority me))
        (define (higher-priority? op)
          (define pri (priority op))
          (cond ((= pri my-priority)
                 (error "Conflict detected between actions of equal priority"
                        me op))
                (else
                 (> pri my-priority))))
        …))

Computing conflicted?

    (define (make-conflict-set-gatherer me)
      (lambda (ignore signal-list)
        …
        (define (conflictor? desirable?-sig)
          (define op (operator desirable?-sig))
          (and (not (eq? op me))
               (steals-my-resource? op)
               (higher-priority? op)))

        ;; Now really compute the list of inputs
        (filter conflictor? signal-list)))

A pathetic example

    (define-operator make-happy 'running-make-happy
      (achieves happy?)
      (preconditions (maintain content?))
      (requires-resources a))

    (define-operator make-happy2 'running-make-happy2
      (achieves happy2?)
      (serial-preconditions not-depressed? content?)
      (requires-resources a b)
      (priority 5))

    (define-operator make-content 'running-make-content
      (achieves content?)
      (clobbers happy?)
      (priority 0))

A pathetic example

    (define-operator make-not-depressed 'running-make-not-depressed
      (achieves not-depressed?))

    (define-signal doit
      (behavior-or make-happy
                   make-happy2
                   make-content
                   make-not-depressed))

Compiled code

    > (compile doit)
    (begin
      (define (run)
        (update-grl-time!)
        (let* ((desirable?-make-happy #f)
               …)
          (while #t
            (update-grl-time!)
            (before-signal-update)
            (set! desirable?-make-happy
                  (and want-happy? (not happy?)))
            (set! desirable?-make-happy2
                  (or (and want-happy2? (not happy2?))
                      stay-happy2?))
            (set! make-happy-activation-level
                  (and desirable?-make-happy
                       content?
                       (not desirable?-make-happy2)))
            (set! make-happy2-activation-level
                  (and desirable?-make-happy2
                       not-depressed?
                       content?
                       (not #f)))
            (set! make-content-activation-level
                  (and (or (and desirable?-make-happy2
                                not-depressed?
                                (not content?))
                           desirable?-make-happy)
                       (not (and want-happy? happy?))))
            (set! doit-activation-level
                  (or make-happy-activation-level
                      make-happy2-activation-level
                      make-content-activation-level
                      (and desirable?-make-happy2
                           (not not-depressed?)
                           (not #f))))
            (set! doit-motor-vector
                  (cond (make-happy-activation-level 'running-make-happy)
                        (make-happy2-activation-level 'running-make-happy2)
                        (make-content-activation-level 'running-make-content)
                        (else 'running-make-not-depressed)))
            (after-signal-update)))))
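To see which operator the compiled logic selects for a given set of inputs, the boolean expressions can be transliterated into Python and called with different sensor values. The function name `doit_step` and the keyword-argument interface are assumptions; the expressions follow the compiled code above term by term, with `(not #f)` dropped since it is always true.

```python
def doit_step(want_happy=False, happy=False, want_happy2=False, happy2=False,
              stay_happy2=False, content=False, not_depressed=False):
    """One pass of the compiled update; returns (activation level, motor vector)."""
    desirable_make_happy = want_happy and not happy
    desirable_make_happy2 = (want_happy2 and not happy2) or stay_happy2

    make_happy = desirable_make_happy and content and not desirable_make_happy2
    make_happy2 = desirable_make_happy2 and not_depressed and content
    make_content = (((desirable_make_happy2 and not_depressed and not content)
                     or desirable_make_happy)
                    and not (want_happy and happy))
    make_not_depressed = desirable_make_happy2 and not not_depressed

    activation = make_happy or make_happy2 or make_content or make_not_depressed
    if make_happy:
        motor = "running-make-happy"
    elif make_happy2:
        motor = "running-make-happy2"
    elif make_content:
        motor = "running-make-content"
    else:
        motor = "running-make-not-depressed"
    return activation, motor


# Wanting happy? while not content: the precondition operator runs first.
print(doit_step(want_happy=True))                # (True, 'running-make-content')
# Once content? holds, make-happy itself becomes active.
print(doit_step(want_happy=True, content=True))  # (True, 'running-make-happy')
# Wanting happy2? while depressed: the first serial precondition is worked on.
print(doit_step(want_happy2=True))               # (True, 'running-make-not-depressed')
```

When the activation level is false, the motor vector falls through to the `else` branch and is not meaningful.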