cwkx
Deep Learning 2: Mathematical principles and backpropagation
Slides: cwkx.github.io/data/teaching/dl-and-rl/dl-lecture2.pdf
Colab: colab.research.google.com/gist/cwkx/dfa207c8ceed5999bdad1ec6f637dd47/distributions.ipynb
Twitter: cwkx
Next video: ua-cam.com/play/PLMsTLcO6etti_SObSLvk9ZNvoS_0yia57.html
Foundational statistics
- probability density function
- joint probability density function
- marginal and conditional probability
- expected values
Foundational calculus
- derivative of a function
- rules of differentiation
- partial derivative of a function
- rules of partial differentiation
- the Jacobian matrix
Mathematics of neural networks
- neural network functions
- computational graphs
- reverse mode of differentiation
#statistics #calculus #probability #deeplearning #jointprobability #marginal #conditional #derivatives #partialderivative #jacobian #neuralnetworks #computationalgraphs
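Since the lecture covers computational graphs and the reverse mode of differentiation, here is a minimal illustrative sketch of reverse-mode autodiff on a scalar graph (this is not the lecture's code; the class and variable names are invented for illustration):

```python
# Minimal reverse-mode differentiation on a scalar computational graph.
class Var:
    def __init__(self, value, parents=()):
        self.value = value      # forward value
        self.parents = parents  # (parent Var, local gradient) pairs
        self.grad = 0.0         # accumulated d(output)/d(self)

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        # reverse pass: push the seed gradient from the output back
        # through each edge, scaling by the local derivative
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(2.0), Var(3.0)
z = x * y + x      # z = xy + x
z.backward()       # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
```

Running the reverse pass accumulates gradients into `x.grad` and `y.grad`; a production system would additionally topologically sort the graph to avoid recomputing shared subgraphs.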
Views: 2,808

Videos

Reinforcement Learning 9: Model-based methods
Views: 2.5K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture9.pdf Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Model-based RL - taxonomy - overview - the simulation cycle - characteristics Integrated learning and planning - Dyna-Q - characteristics - Monte Carlo tree search - simulated policy learning #reinforcementlearning #modelbased #MCTS #planni...
Reinforcement Learning 10: Extended methods
Views: 887 · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture10.pdf Atari: ua-cam.com/play/PL34t13IwtOXUNliyyJtoamekLAbqhB9Il.html Twitter: cwkx Playlist: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Distributed and recurrent RL - DQN characteristics - recurrent replay in distributed RL - R2D2 performance Exploration vs exploitation - approaches Intrinsic rewards - NGU: intri...
Reinforcement Learning 8: Policy gradient methods
Views: 1.4K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture8.pdf Code: github.com/higgsfield/RL-Adventure-2 Theory: lilianweng.github.io/lil-log/2018/04/08/policy-gradient-algorithms.html Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Policy-based methods - definition - characteristics - deterministic vs stochastic policies Policy gradients - gradien...
Reinforcement Learning 6: Temporal-difference methods
Views: 4.4K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture6.pdf Colab: colab.research.google.com/gist/cwkx/54e2e6d59918a083e47f19404fe275b4/temporal-difference-learning.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Temporal-difference learning - dopamine and reward predictor error - definition - behaviour example SARSA (on-policy TD control) ...
Reinforcement Learning 4: Dynamic programming
Views: 8K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture4.pdf Colab: colab.research.google.com/gist/cwkx/670c8d44a9a342355a4a883c498dbc9d/dynamic-programming.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Introduction - definition - examples - planning in an MDP Policy evaluation - definition - synchronous algorithm Policy iteration - policy...
Reinforcement Learning 7: Function approximation
Views: 3.6K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture7.pdf Code: github.com/higgsfield/RL-Adventure Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Function approximation - introduction - definition - challenges Incremental methods - SGD for prediction - SGD for control - convergence Batch learning - experience replay - model freezing with doubl...
Reinforcement Learning 5: Monte Carlo methods
Views: 4K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture5.pdf Colab: colab.research.google.com/gist/cwkx/a5129e8888562d1b4ecb0da611c58ce8/monte-carlo-methods.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Introduction - history of Monte Carlo methods - definition Monte Carlo prediction - overview - definition - incremental means - prediction...
Reinforcement Learning 3: OpenAI gym
Views: 8K · 3 years ago
Guest lecture by Adam Leach. Colab: colab.research.google.com/gist/qazwsxal/6cc1c5cf16a23ae6ea8d5c369828fa80/gym-demo.ipynb The last 20 minutes of this video can be skipped by most watchers (it contains specifics of how to use one of Durham's GPU servers; feel free to watch if interested, otherwise just use Colab). Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Content: OpenAI gym C...
Reinforcement Learning 2: Markov Decision Processes
Views: 8K · 3 years ago
This lecture uses the excellent MDP example from David Silver. Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture2.pdf Colab: colab.research.google.com/gist/cwkx/ba6c44031137575d2445901ee90454da/mrp.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Content: Markov Chains - markov property - state transition matrix - definition and example...
Reinforcement Learning 1: Foundations
Views: 6K · 3 years ago
This is based on David Silver's course but targets younger students within a shorter 50-minute format (omitting the advanced derivations), with more examples and Colab code. Slides: cwkx.github.io/data/teaching/dl-and-rl/rl-lecture1.pdf Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6ettgmyLVrcPvFLYi2Rs-R4JOE.html Introduction - definition - examples - comparison A Brief History - learnin...
Deep Learning 10: Meta learning and manifold learning
Views: 897 · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/dl-lecture10.pdf Twitter: cwkx Playlist: ua-cam.com/play/PLMsTLcO6etti_SObSLvk9ZNvoS_0yia57.html Manifold learning - NLDR with DNNs - t-SNE and UMAP on DNNs - designing tailored embeddings - Jonker-Volgenant assignment Meta learning - thinking in distributions - the distribution of all data... - ...and of all tasks - definition - the me...
Deep Learning 9: Flow models and implicit networks
Views: 918 · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/dl-lecture9.pdf GON: cwkx.github.io/data/GON/ SIREN: vsitzmann.github.io/siren/ Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6etti_SObSLvk9ZNvoS_0yia57.html Flow models - definition - the determinant - the change of variables theorem Normalising flows - definition - triangular Jacobians - normalising flow layers Implicit representa...
Deep Learning 7: Energy-based models
Views: 8K · 3 years ago
Slides: cwkx.github.io/data/teaching/dl-and-rl/dl-lecture7.pdf Colab: colab.research.google.com/gist/cwkx/6b2d802e804e908a3ee3d58c1e0e73be/dbm.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6etti_SObSLvk9ZNvoS_0yia57.html Manifolds Energy-based models - definition - GANs as energy-based models - clustering as an energy-based model - softmax and softmin - exact likelihood Co...
Deep Learning 3: PyTorch programming (coding session)
Views: 2K · 3 years ago
This video has minor audio problems, but they resolve later. Colab 1: colab.research.google.com/gist/cwkx/441e508d3b904413fd3950a09a1d3bd6/classifier.ipynb Colab 2: colab.research.google.com/gist/cwkx/3a6eba039aa9f68d0b9d37a02216d385/convnet.ipynb Twitter: cwkx Next video: ua-cam.com/play/PLMsTLcO6etti_SObSLvk9ZNvoS_0yia57.html #PyTorch #programming #livecoding #deeplearning #vis...
Deep Learning 8: Sequential models
Views: 687 · 3 years ago
Deep Learning 8: Sequential models
Deep Learning 6: Adversarial models
Views: 758 · 3 years ago
Deep Learning 6: Adversarial models
Deep Learning 5: Generative models
Views: 2K · 3 years ago
Deep Learning 5: Generative models
Deep Learning 4: Designing Models to Generalise
Views: 1.3K · 3 years ago
Deep Learning 4: Designing Models to Generalise
Deep Learning 1: Introduction
Views: 8K · 3 years ago
Deep Learning 1: Introduction
Gradient Origin Networks (GONs)
Views: 937 · 3 years ago
Gradient Origin Networks (GONs)
Interactive GPU active contours for segmenting inhomogeneous objects
Views: 460 · 6 years ago
Interactive GPU active contours for segmenting inhomogeneous objects
ABI: Automatic 3D Billboard Imposters
Views: 9K · 10 years ago
ABI: Automatic 3D Billboard Imposters
Starship Physics
Views: 471 · 10 years ago
Starship Physics

COMMENTS

  • @mk_upo
    @mk_upo · 1 month ago

    Great! Thanks

  • @BlueBirdgg
    @BlueBirdgg · 2 months ago

    Interesting example at 14:00. Watched both of your series. Thank you very much!

  • @monkeysareaproblem1743
    @monkeysareaproblem1743 · 2 months ago

    Looks like the Colab code is outdated. The step "# setup the environment, and record a video every 50 episodes" gives: AttributeError: module 'gym.wrappers' has no attribute 'Monitor'. There are also a lot of deprecation warnings.
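For readers hitting the same error: `Monitor` was removed from gym in later releases, with `RecordVideo` as its replacement. A hedged sketch (assuming gym >= 0.21; the environment name and folder below are illustrative, not taken from the notebook):

```python
# gym.wrappers.Monitor no longer exists in newer gym releases; RecordVideo
# is the replacement (assuming gym >= 0.21 is installed), roughly:
#
#   import gym
#   from gym.wrappers import RecordVideo
#   env = RecordVideo(gym.make("CartPole-v1"), video_folder="videos",
#                     episode_trigger=record_every_50)
#
# The trigger that records a video every 50 episodes is plain Python:
def record_every_50(episode_id):
    return episode_id % 50 == 0

print(record_every_50(50))  # True
```

The gym calls are left as comments here because the exact wrapper import path varies across gym/gymnasium versions.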

  • @BlueBirdgg
    @BlueBirdgg · 3 months ago

    Thank you for the classes! Incredible so far!

  • @ichaa3tech
    @ichaa3tech · 5 months ago

    This is seriously very underrated, the best

  • @bello3137
    @bello3137 · 5 months ago

    You extract just the key things needed from Sutton's book. I follow his book while following your videos, and a bunch of others as well 😁 thank you

  • @SaschaRobitzki
    @SaschaRobitzki · 6 months ago

    Why is the action value q* for going from Class 3 to the pub 9.4?

  • @fatemehnorouzi1722
    @fatemehnorouzi1722 · 7 months ago

    I have watched several lectures about RL and yours have been the best

  • @mohammadnadeem839
    @mohammadnadeem839 · 7 months ago

    I am the 2,000th person added to your subscriber list

  • @lovol2
    @lovol2 · 9 months ago

    Thanks. It was good to hear that there is a lack of consistency in the notation used in papers, as that was confusing!

  • @wajidiqbal5633
    @wajidiqbal5633 · 10 months ago

    thank you prof, for this elegant explanation....

  • @compsciorbust9562
    @compsciorbust9562 · 11 months ago

    Incomplete list of timestamps:
    0:00 - Introduction
    3:01 - Installing Conda
    13:25 - Using Torch (tensors)
    23:00 - Using Torch (data parsing)
    32:15 - Using backprop
    52:00 (ish) - Using convolution

  • @ARREYAR
    @ARREYAR · 1 year ago

    The best lectures on RL

  • @user-ch1qs8rz2u
    @user-ch1qs8rz2u · 1 year ago

    it was perfect thank you so much

  • @xiaocenliu
    @xiaocenliu · 1 year ago

    thank you so much for not assuming students know everything 😭. You explain it so clearly!

  • @jrohit1110
    @jrohit1110 · 1 year ago

    Chapter 12 of Sutton and Barto. This is what I was looking for. Thanks for the beautiful explanation!

  • @AymenSekhri-gw8wh
    @AymenSekhri-gw8wh · 1 year ago

    Thank you so much! Rich content for free.

  • @AymenSekhri-gw8wh
    @AymenSekhri-gw8wh · 1 year ago

    Good Explanation, thank you so much

  • @yuktikaura
    @yuktikaura · 1 year ago

    Would it be possible for you to share the LaTeX template for the presentation?

  • @chanpreetsingh007
    @chanpreetsingh007 · 1 year ago

    thanx

  • @yuktikaura
    @yuktikaura · 1 year ago

    How do we conclude that "->success: {stats/episode}" would always be monotonically increasing? And how does it indicate convergence with a value of 0.54?

    • @cwkx
      @cwkx · 1 year ago

      Hi, I can't remember what I said here; it was over 2 years ago. We go into depth on the convergence properties for this in our practicals, which are on my GitHub: github.com/cwkx/materials/raw/main/reinforcement-learning/rl-answers4.pdf and github.com/cwkx/materials/blob/main/reinforcement-learning/rl-answers5.pdf etc.

  • @yuktikaura
    @yuktikaura · 1 year ago

    Thanks for a well explained topic.

  • @yuktikaura
    @yuktikaura · 1 year ago

    Thank you for doing these videos... I found them really very helpful

  • @hom01
    @hom01 · 1 year ago

    Amazing lecture, thanks for uploading

  • @maryamfarajzadeh2262
    @maryamfarajzadeh2262 · 1 year ago

    I really recommend this video! Just perfect! I wish POMDPs were also explained in the same way!

  • @GeneralKenobi69420
    @GeneralKenobi69420 · 1 year ago

    sassy 💅

  • @danielji5184
    @danielji5184 · 1 year ago

    good lecture

  • @agustinrovira2955
    @agustinrovira2955 · 1 year ago

    Great job !

  • @benmiss1767
    @benmiss1767 · 1 year ago

    Very, very useful and informative, thank you very much!

  • @rladndud1722
    @rladndud1722 · 1 year ago

    35:02 implicit networks

  • @alebadi
    @alebadi · 1 year ago

    You started not knowing what you are talking about.

  • @phenixzhang4224
    @phenixzhang4224 · 2 years ago

    As a newcomer who has already learned PyTorch, this class has allowed me to consolidate my hands-on practice with torch and understand the magic of vision. In addition, I would like to ask Dr Willcocks: how should we learn more about using PyTorch to build our own networks?

  • @aliheidary355
    @aliheidary355 · 2 years ago

    Could you speak louder in your next video?

  • @zbaker0071
    @zbaker0071 · 2 years ago

    On Langevin Dynamics in Machine Learning - Michael I. Jordan (Video Link): ua-cam.com/video/QTnjqdxG99c/v-deo.html&ab_channel=InstituteforAdvancedStudy

  • @bing6740
    @bing6740 · 2 years ago

    The example of the car accident got me

  • @malathreayad
    @malathreayad · 2 years ago

    Excellent explanation, well done

  • @AhamedKabeer-wn1jb
    @AhamedKabeer-wn1jb · 2 years ago

    Thank you..

  • @user-zc8kj4sw5w
    @user-zc8kj4sw5w · 2 years ago

    By far one of the most thorough and helpful explanations I've encountered! THANKS!

  • @shahardagan1584
    @shahardagan1584 · 2 years ago

    Hi, great lectures! I would like to know if you can recommend more courses and resources to advance in the field

    • @cwkx
      @cwkx · 2 years ago

      I'd just recommend getting into a habit of reading the latest papers from the top venues such as ICLR, NeurIPS and CVPR once they've been reviewed, e.g. find a sorted ranked list of the best papers, read the abstracts, and Ctrl+F any terms interesting to you - e.g. tanelp.github.io/neurips2021/ and papers.labml.ai/papers/iclr_2022?sort_by=conference_score&dsc=0

  • @Manishkumar-ww4gm
    @Manishkumar-ww4gm · 2 years ago

    Very nice explanation, sir. Thank you

  • @amirmahdikhosrvitabrizi7516
    @amirmahdikhosrvitabrizi7516 · 2 years ago

    It was amazing; this lecture has made my life much easier. Thank you.

  • @Jannls
    @Jannls · 2 years ago

    Your video helped me a lot! Very informative and easy to understand. Thank you!

  • @joaoborges2014
    @joaoborges2014 · 2 years ago

    Amazing lecture, thank you so much!

  • @saharrahimimalakshan5485
    @saharrahimimalakshan5485 · 2 years ago

    It was an amazing video, you explained the issues in the best way. Thank you

  • @phenixzhang4224
    @phenixzhang4224 · 2 years ago

    Recently I have been following the teacher's second class on and off, covering some basic theoretical knowledge related to backpropagation. The main difficulty is still the formulas described in English, but I can feel my listening improving. I also manually derived the backpropagation algorithm offline and implemented it in Python. Keep going!

  • @AECTechJourneys
    @AECTechJourneys · 2 years ago

    This is a gem. Great content.

  • @johnnassour
    @johnnassour · 3 years ago

    Would you please check the calculation for 0.34? It is 0.3125 in my calculation. Thank you.

    • @cwkx
      @cwkx · 3 years ago

      0.34375 = 0.25*0.0625 (left) + 0.25*0.0625 (up) + 0.25*0.25 (down) + 0.25*1 (right). You can see this clearly if you go to the Colab notebook in the comment: in the Policy evaluation section, where it first says "# evaluate this policy ... V = policy_evaluation(env,policy,draw=False)", change draw=True and you'll see all the intermediate steps, including 0.34375.
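The reply's arithmetic can be checked directly. A minimal sketch of this single backup under the uniform random policy (the four successor values are the ones quoted above; a discount of 1 and zero immediate reward are assumed for this step):

```python
# One policy-evaluation backup for a single state under a uniform random
# policy (probability 0.25 per action); successor values are the ones
# quoted in the reply (assumed: gamma = 1, zero immediate reward).
policy_prob = 0.25
successor_values = {"left": 0.0625, "up": 0.0625, "down": 0.25, "right": 1.0}
v = sum(policy_prob * value for value in successor_values.values())
print(v)  # 0.34375
```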

    • @johnnassour
      @johnnassour · 3 years ago

      @@cwkx So you use the updated values in the same episode? I thought they would be used in the next episode.

    • @harrivayrynen
      @harrivayrynen · 1 year ago

      @@johnnassour Yes, there must be a mistake in the video's calculations. If you look at the algorithm, the array is updated only once all states have been gone through. But it is still a very good video series, thanks for that.

  • @phenixzhang4224
    @phenixzhang4224 · 3 years ago

    Deep learning is indeed developing faster and faster, which requires us to understand it in essence, including biological perspectives, historical perspectives, etc. I hope I can build a more systematic understanding of deep learning through this course. By the way, Teacher Willcocks's English expression is very fluent, and it doesn't sound particularly strenuous.

  • @Dian87barry
    @Dian87barry · 3 years ago

    Hello dear Sir, interesting video. Would it be possible to have the code for this video?

    • @cwkx
      @cwkx · 3 years ago

      Hi Mamadou, all code, where available, is in the video descriptions (Colab links).

    • @Dian87barry
      @Dian87barry · 3 years ago

      @@cwkx thank you

  • @alialtan8182
    @alialtan8182 · 3 years ago

    Hi there, big fan! I learned that you also have expertise in security; is there any chance you will consider teaching it on YouTube?

    • @cwkx
      @cwkx · 3 years ago

      Hi Ali, many thanks for the kind comment - unfortunately I don't think I can get permission for this due to the sensitive nature of some of the security materials/discussions/exploits/real-world stories not covered in the slides.