Click here to load reader

Short-Term Conflict Resolution for Unmanned Aircraft Traffic Management

  • View
    344

  • Download
    3

Embed Size (px)

Text of Short-Term Conflict Resolution for Unmanned Aircraft Traffic Management

  • Short-Term Conflict Resolution forUnmanned Aircraft Traffic Management

    Hao Yi Ong and Mykel J. Kochenderfer

    Stanford University

    September 22, 2015

  • Outline

    Introduction

    Mathematical formulation

    Approach

    Numerical experiments

    UTM-: A distributed framework

    Conclusion and future work

    Introduction 2

  • Unmanned aircraft traffic management

    I automating conflict avoidance is critical to integrating smallunmanned aerial systems (UAS) to civil airspace

    I automation to augment controllers

    Introduction 3

  • Conflict avoidance

    given potential conflicts between drones, find the best set of advisories

    I focus on horizontal or co-altitude conflict resolution

    NASAs proposed airspace for UAS is under 150 m (500 ft) flight altitudes need to avoid disruption and buildings

    I conflict defined as loss of minimum separation between aircraft

    Introduction 4

  • Complications

    I multiagent problem

    system must coordinate between many aircraft large search space for solution

    I uncertainty in system

    imperfect sensor measurements of current state variable pilot response, vehicle performance, etc. affect future path

    I appropriate trade-off between safety and efficiency not obvious

    Introduction 5

  • Goals

    robust and efficient method

    I tractable approach to complex stochastic problem

    account for uncertainty in large problem with reasonable computeresources

    I balances between airspace safety and efficiency

    ensure safety and provide timely conflict alerts to aircraft

    I arbitrary-scale optimization

    real-time conflict resolution for large airspace

    Introduction 6

  • Previous approaches

    I mathematical programming (MIP, SCP) [SVFH05, ASD12]

    can work well for simple vehicle networks

    I distributed convex optimization [OG15]

    hard to incorporate stochastic objectives/constraints

    I Markov decision process formulation (ACAS X) [KC11, CK12]

    assumes white noise accelerations for intruder aircraft

    I and many more. . . [KY00]

    Introduction 7

  • Overview

    idea: decomposition + coordination

    multi-agent pairwise joint policy

    decompose pairwise solutions

    + coordination

    =

    ac1: left ac2: right ac3: straight

    Introduction 8

  • Conflict Resolution as a UTM service

    UTM server client side

    resolution server

    filtered sector status updates

    advisories

    advisories

    status updates

    Introduction 9

  • Conflict Resolution as a Standalone service

    I derivative UTM client service

    subscribes to UTM server to track aircraft and deconflict routes pub-sub system that operators subscribe to to receive advisories

    I current implementation uses standalone model

    still unclear about end architecture for resolution service clean approach to get started with prototype system implemented with scalability and modularity in mind

    Introduction 10

  • Outline

    Introduction

    Mathematical formulation

    Approach

    Numerical experiments

    UTM-: A distributed framework

    Conclusion and future work

    Mathematical formulation 11

  • Markov decision process (MDP)

    defined by the tuple (S,A, T,R)I S and A are the sets of all possible states and actions, respectively

    I T (s, a, s) gives the probability of transitioning into state s bytaking action a at the current state s

    I R (s, a) gives the reward for taking action a at the current state s

    environment T(s,a,s )!

    agent

    action a!

    state s!

    reward R(s,a)!

    Mathematical formulation 12

  • Value iteration

    agent (s )"

    action a"

    state s"

    I want to find the optimal policy ? (s)

    gives action that maximizes the utility Q? (s, a) from any given state

    ? (s) = argmaxaA

    Q? (s, a)

    I value iteration updates value function guess Q until convergence

    expensive one-off compute but cheap policy extraction (Q lookup)

    Q (s, a) := R (s, a) +sS

    T (s, a, s)maxaA

    Q (s, a)

    Mathematical formulation 13

  • Multiagent MDP

    extension of MDP to cooperative multiagent setting

    I similar to case where centralized planner has access to system state

    I also defined by the tuple (S,A, T,R), except

    S and A are all possible joint states and actions T and R operate on elements of S and A

    Mathematical formulation 14

  • Short-term conflict resolution

    I horizontal conflict resolution between n aircraft

    conflict defined as loss of minimum separation distance, 500 m aircraft at risk if it could experience a conflict within two minutes

    I alert aircraft that need corrective maneuvers

    but not to the extent that alerts become a nuisance

    Mathematical formulation 15

  • Resolution advisories

    I at each time step T = 5 s, system issues a joint advisory

    joint advisory chosen out of a finite set of corrective bank anglesfor each aircraft

    I action set A = {20,10, 0, 10, 20,COC}n

    positive and negative angles correspond to left and right banks COC is a clear-of-conflict status advisorynothing needs to be done

    I higher resolution comes at the cost of computational complexity

    Mathematical formulation 16

  • Dynamics

    I ith aircraft state si = (xi, yi, i, vi, pi)

    latitude and longitude xi and yi heading i groundspeed vi responsiveness indicator pi

    I aircraft follow Dubins kinematic model

    simplicity avoids risk of overfitting complicated models

    xi = vi cosi yi = vi sini i =g tanivi

    i!

    bank: i!

    (xi , yi )!vi!

    Mathematical formulation 17

  • Pilot model

    I advisory response determined stochastically by Bernoulli process

    I 5 s average response delay to first corrective advisory asrecommended by ICAO [ICA07]

    when responding, the pilot executes the advisory for 5 s time step

    non-responsive: white noise path

    responsive: corrective action

    COC 5 s average response lag

    Mathematical formulation 18

  • Reward function

    balance competing objectives

    I maximize safety

    loss of separation

    cost

    separation

    closeness penalty

    I minimize disruption

    cost

    advisory COC 0 10 20

    conflict penalty

    Mathematical formulation 19

  • Outline

    Introduction

    Mathematical formulation

    Approach

    Numerical experiments

    UTM-: A distributed framework

    Conclusion and future work

    Approach 20

  • Curse of dimensionality

    I in an MMDP, S and A are all possible joint states and actions

    I problem: S and A scale exponentially with number of agents

    3 drones: 1000 states

    2 drones: 100 states

    1 drone: 10 states

    Approach 21

  • Decompose and coordinate

    multi-agent pairwise joint policy

    decompose pairwise solutions

    + coordination

    =

    ac1: left ac2: right ac3: straight

    Approach 22

  • MMDP decomposition

    multi-agent pairwise joint policy

    decompose

    ac1: left ac2: right ac3: straight

    pairwise solutions +

    coordination =

    Approach 23

  • Pairwise conflict resolution

    decompose full MMDP into O(n2)

    pairwise encounters

    I action set, pilot response model, and reward function unchanged

    I use relative positions and bearing to reduce state space

    arbitrarily set one aircraft as ownship, and the other as intruder speeds remain in global reference frame

    s =(xrel, yrel, rel, vownship, vintruder, pownship, pintruder

    )

    (0, 0 )!

    v ownship!

    rel

    (x rel , y rel )!

    v intruder

    Approach 24

  • Discretization

    I in general, no analytical solution to problem with continuous statespace

    I approximate value function Q over discretized state space

    discretize state space via multilinear interpolation iteratively update Q until convergence

    Q (s, a) := R (s, a) +sS

    T (s, a, s)maxaA

    Q (s, a)

    Approach 25

  • Approximating environment noise

    sigma-point sampling

    speed bank

    weights

    sigma-point sample

    Approach 26

  • Implementation

    I discretize continuous state space into 9.6 million states

    expensive but one-off computation

    I Q converges in 7 hours (40 iterations over S A)

    solver: parallel implementation in Julia machine: 20 2.3 GHz Intel Xeon cores with 125 GB RAM

    Variable Minimum Maximum Number of values

    xrel, yrel 3000 m 3000 m 51rel 0 360 37vownship, vintruder 10 m s1 20 m s1 5pownship, pintruder - - 2

    Approach 27

  • Pairwise policy visualization

    passing intrusion (240 bearing)

    COC

    I

    I2,0001,000 0 1,000 2,000

    2,000

    1,000

    0

    1,000

    2,000

    x (m)

    y(m

    )

    Ownship action

    COC

    I

    I

    2,0001,000 0 1,000 2,0002,000

    1,000

    0

    1,000

    2,000

    x (m)

    Intruder action

    20

    10

    0

    10

    20

    Approach 28

  • Example advisory execution

    passing intrusion at (1000 m, 0 m)

    ownship: 10 left bank intruder: 10 left bank

    Approach 29

  • Accounting for pilot response

    pilots responsive (top) and non-responsive (bottom)

    COC

    I

    I

    2,0001,000 0 1,000 2,0002,000

    1,000

    0

    1,000

    2,000

    x (m)

    y(m

    )Ownship action

    COC

Search related