Welcome to Tesla Motors Club
Discuss Tesla's Model S, Model 3, Model X, Model Y, Cybertruck, Roadster and More.

NeurIPS 2019 paper: Causal Confusion in Imitation Learning



Feb 10, 2021

“Behavioral cloning reduces policy learning to supervised learning by training a discriminative model to predict expert actions given observations. Such discriminative models are non-causal: the training procedure is unaware of the causal structure of the interaction between the expert and the environment. We point out that ignoring causality is particularly damaging because of the distributional shift in imitation learning. In particular, it leads to a counter-intuitive “causal misidentification” phenomenon: access to more information can yield worse performance. We investigate how this problem arises, and propose a solution to combat it through targeted interventions—either environment interaction or expert queries—to determine the correct causal model. We show that causal misidentification occurs in several benchmark control domains as well as realistic driving settings, and validate our solution against DAgger and other baselines and ablations.”
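The paper's driving example makes the "more information can be worse" phenomenon concrete: if the previous action (e.g. a dashboard brake indicator) is visible in the observation, a cloned policy can fit the expert data better by copying that indicator than by reading the actual road. A toy numpy sketch of that statistic (all numbers and the `make_episode` generator are invented stand-ins, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_episode(T=200, p_switch=0.05):
    # Obstacles persist over time (switch with small probability), so the
    # expert's consecutive actions are highly correlated.
    obstacle = np.zeros(T, dtype=int)
    for t in range(1, T):
        obstacle[t] = obstacle[t - 1] ^ (rng.random() < p_switch)
    expert_action = obstacle.copy()                 # expert brakes iff obstacle
    prev_action = np.roll(expert_action, 1)         # the confounder in the observation
    prev_action[0] = 0
    # The true cause (the obstacle) is only observed through a noisy sensor.
    noisy_obstacle = np.where(rng.random(T) < 0.1, 1 - obstacle, obstacle)
    return noisy_obstacle, prev_action, expert_action

# Build an imitation-learning training set from expert episodes.
X_parts, y_parts = [], []
for _ in range(50):
    s, p, a = make_episode()
    X_parts.append(np.stack([s, p], axis=1))
    y_parts.append(a)
X = np.concatenate(X_parts)
y = np.concatenate(y_parts)

# Training accuracy of two trivial policies:
copy_prev = (X[:, 1] == y).mean()   # "repeat the previous action" -> ~0.95
use_sensor = (X[:, 0] == y).mean()  # "act on the noisy true cause" -> ~0.90

print(f"imitate previous action: {copy_prev:.2f}")
print(f"use the actual cause:    {use_sensor:.2f}")
```

On expert data the confounder looks like the better predictor, so a discriminative model latches onto it. At deployment, though, the "previous action" column is the agent's *own* previous action, not the expert's, so the shortcut that scored best in training is exactly the one that collapses under distributional shift.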

PDF: https://proceedings.neurips.cc/paper/2019/file/947018640bf36a2bb609d3557a285329-Paper.pdf

Short talk by one of the authors explaining the paper:

Long, wide-ranging interview with one of the authors:

Advancements in Machine Learning with Sergey Levine - The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Autonomous vehicles can learn behaviour planning from observing human driving, but a weakness of this approach is that the learning is superficial and latches onto incorrect proxies for, or mere correlates of, the correct behaviours.

Targeted interventions provide a way for deep neural networks to correctly learn the real causal relationships between the key elements of the perceptual world.
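One way to picture the intervention idea (a brute-force sketch, not the paper's actual algorithm, which trains a single graph-conditioned policy and infers the likely graph far more cheaply): treat each binary feature mask as a candidate causal graph, roll out a policy restricted to that mask, and keep the mask that earns the most reward. `toy_return` below is a hypothetical stand-in for "train on these features, then interact with the environment":

```python
import itertools

def select_causal_mask(rollout_return, n_features):
    """Try every feature mask; keep the one whose masked policy scores best."""
    best_mask, best_ret = None, float("-inf")
    for mask in itertools.product([0, 1], repeat=n_features):
        ret = rollout_return(mask)   # environment interaction = the intervention
        if ret > best_ret:
            best_mask, best_ret = mask, ret
    return best_mask

# Toy stand-in: feature 0 is the true cause (+1 reward when used),
# feature 1 is a confounder that hurts at test time (-2 when used).
def toy_return(mask):
    return 1.0 * mask[0] - 2.0 * mask[1]

print(select_causal_mask(toy_return, 2))  # (1, 0): keep the cause, drop the confounder
```

The point is that only interaction (or expert queries) can break the tie: on the expert's own data, both masks can look equally good.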

Understanding causality is widely believed by ML experts to be an important challenge in deploying robotic agents (such as autonomous vehicles) that act intelligently in the real world.
Not disagreeing with the paper, but...

But behavioral cloning of experts isn't the goal for most AI learning; the goal is to be much better than the average expert/driver. It needs to be better to gain acceptance. Learning to be a poor imitation of an expert isn't much use in the scheme of things.

You can't achieve that by simply learning to imitate what experts have decided the cause/effect relationships are. In that case you will never be better than the expert; you can only approach the competency of the teacher.

You need multiple methods of learning, of course.
  1. Being taught by a teacher, from existing knowledge gets you going. (Teaching)
  2. Then learning more by watching peers/experts do what you are trying to do, knowing what the good/bad outcomes are, moves you beyond any one teacher. (Emulating Experts)
  3. Then you are out on your own, learning from experience. This is where you learn more about what good outcomes are, etc. etc. (Experience)
When you get to #2 and especially #3, that is where LOTS of data comes in, as the data isn't black/white and the causes/effects aren't already known.

At #2 you really aren't trying to emulate/clone the expert; you are trying to build a generalization of what you see across all the experts. Experts aren't 100% correct 100% of the time, so you don't want to emulate them; you want to learn/generalize what the rules are.
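That "generalize across many imperfect experts" point is just the statistics of aggregation. A toy sketch (invented numbers): each expert labels the right action only 70% of the time, yet the majority vote of eleven of them is right far more often than any single one:

```python
import numpy as np

rng = np.random.default_rng(2)

truth = rng.integers(0, 2, size=5000)   # the "correct" action in each situation
n_experts, p_correct = 11, 0.7          # each expert is right 70% of the time

# Each expert's labels: correct with probability p_correct, flipped otherwise.
experts = np.where(rng.random((n_experts, 5000)) < p_correct, truth, 1 - truth)

single = (experts[0] == truth).mean()                                # ~0.70
majority = ((experts.sum(axis=0) > n_experts // 2) == truth).mean()  # ~0.92

print(f"one expert: {single:.2f}   majority of {n_experts}: {majority:.2f}")
```

Real generalization means learning a model rather than voting, but the underlying effect is the same: independent errors wash out, so the aggregate exceeds any individual teacher.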

Experts also can't teach you 100% of what they know. Maybe 50% of what they know is conscious; the other 50% is experience (they don't know why they know, they just do).

At #3 you are refining what you learned from watching experts, AND making up your own rules based on the outcomes you see. You need to get into #3 if you want to exceed the capabilities of your teachers.
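That "imitation caps you at the teacher, experience lets you pass them" point can be shown with a toy bandit (everything here is invented for illustration): cloning an 80%-correct expert yields roughly the expert's value, while estimating rewards from your own trials finds the genuinely best action:

```python
import numpy as np

rng = np.random.default_rng(1)

true_reward = np.array([0.2, 0.5, 0.9])   # action 2 is genuinely best

# Stages 1/2: imitation. The expert picks the best action only 80% of the
# time, so cloning its action distribution caps you at the expert's level.
expert_actions = rng.choice(3, size=1000, p=[0.1, 0.1, 0.8])
bc_policy = np.bincount(expert_actions, minlength=3) / 1000
bc_value = bc_policy @ true_reward        # ~0.79

# Stage 3: experience. Estimate each action's reward from your own trials
# and commit to the best estimate; this can exceed the expert.
counts, sums = np.zeros(3), np.zeros(3)
for _ in range(3000):
    a = rng.integers(3)                   # explore uniformly (a toy choice)
    counts[a] += 1
    sums[a] += rng.random() < true_reward[a]
experience_policy = np.eye(3)[np.argmax(sums / counts)]
experience_value = experience_policy @ true_reward

print(f"imitation value:  {bc_value:.2f}")
print(f"experience value: {experience_value:.2f}")
```

Uniform exploration is deliberately naive here; the point is only that outcome feedback, unlike expert labels, contains the information needed to beat the teacher.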

Ever heard the expression "You start the game with a full pot o' luck and an empty pot o' experience; the object is to fill the pot of experience before you empty the pot of luck"? Well, that.

That requires ingesting lots of data and working out what's good, what's bad, and what you can get away with, for yourself. The AI is no different, really; it just has the advantage of LOTS more data it can see, and it can process and learn from that data quicker than we can.

Much of 'experience' isn't about finding cause/effect relationships; it's about getting advance warning of the "butterfly effect". There may be no simple cause/effect, BUT you can still make predictions of behavior if you know the rules.