Perturbation confusion in forward automatic differentiation of higher-order functions (ICFP 2020 - Program)

Thu 20 - Fri 28 August 2020

Who

Oleksandr Manzyuk, Barak A. Pearlmutter, Alexey Radul, David Rush, Jeffrey Mark Siskind

Track

ICFP 2020 ICFP Program

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

When

Tue 25 Aug 2020 12:26 - 12:37 at ICFP NY 3 - New York 3 (JFP talks) Chair(s): Jeremy Gibbons
Tue 25 Aug 2020 23:26 - 23:37 at ICFP Asia 3 - Asia 3 (JFP talks) Chair(s): Jeremy Gibbons

Abstract

Automatic differentiation (AD) is a technique for augmenting computer programs to compute derivatives. The essence of AD in its forward accumulation mode is to attach perturbations to each number, and propagate these through the computation by overloading the arithmetic operators. When derivatives are nested, the distinct derivative calculations, and their associated perturbations, must be distinguished. This is typically accomplished by creating a unique tag for each derivative calculation and tagging the perturbations. We exhibit a subtle bug, present in fielded implementations which support derivatives of higher-order functions, in which perturbations are confused despite the tagging machinery, leading to incorrect results. The essence of the bug is as follows: a unique tag is needed for each derivative calculation, but in existing implementations unique tags are created when taking the derivative of a function at a point. When taking derivatives of higher-order functions, these need not correspond! We exhibit a simple example: a higher-order function f whose derivative at a point x, namely f′(x), is itself a function which calculates a derivative. This situation arises naturally when taking derivatives of curried functions. Two potential solutions are presented, and their deficiencies discussed. One uses eta expansion to delay the creation of fresh tags in order to put them into one-to-one correspondence with derivative calculations. The other wraps outputs of derivative operators with tag substitution machinery. Both solutions seem very difficult to implement without violating the desirable complexity guarantees of forward AD.
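The tagging scheme and the failure mode described in the abstract can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the authors' implementation: the names `Dual`, `derivative`, `nested`, `shift`, `cube`, `d_hat`, and `d_hat_eta` are hypothetical, only `+` and `*` are overloaded, and tags are assumed to be integers that increase with nesting depth. The `fresh=False` mode deliberately shares one tag to show what the tags are for.

```python
import itertools

_fresh_tags = itertools.count(1)  # tag 0 is reserved for the deliberately broken mode


class Dual:
    """primal + tangent * eps, where `tag` names the derivative that owns eps."""

    def __init__(self, tag, primal, tangent):
        self.tag, self.primal, self.tangent = tag, primal, tangent

    def __add__(self, other):
        t = _top_tag(self, other)
        return Dual(t, _primal(t, self) + _primal(t, other),
                    _tangent(t, self) + _tangent(t, other))

    def __mul__(self, other):
        t = _top_tag(self, other)
        a, da = _primal(t, self), _tangent(t, self)
        b, db = _primal(t, other), _tangent(t, other)
        return Dual(t, a * b, a * db + da * b)  # product rule

    __radd__, __rmul__ = __add__, __mul__  # both operations commute


def _top_tag(a, b):
    # The most recently created tag among the operands sits outermost.
    return max(v.tag for v in (a, b) if isinstance(v, Dual))


def _primal(tag, v):
    return v.primal if isinstance(v, Dual) and v.tag == tag else v


def _tangent(tag, v):
    return v.tangent if isinstance(v, Dual) and v.tag == tag else 0


def _extract(tag, value):
    # If the output is a function, extract from whatever it later returns.
    if callable(value):
        return lambda *args: _extract(tag, value(*args))
    return _tangent(tag, value)


def derivative(f, x, fresh=True):
    # The tag is created here, when f is differentiated at the point x.
    tag = next(_fresh_tags) if fresh else 0
    return _extract(tag, f(Dual(tag, x, 1)))


# Nested derivatives: d/dx [ x * d/dy (x + y) at y = 1 ] at x = 1, which is 1.
def nested(fresh):
    return derivative(
        lambda x: x * derivative(lambda y: x + y, 1, fresh), 1, fresh)


print(nested(fresh=True))   # 1: distinct tags keep the two perturbations apart
print(nested(fresh=False))  # 2: one shared tag confuses them

# Higher-order case: differentiating a curried function returns a function.
shift = lambda u: lambda f: lambda x: f(x + u)
cube = lambda x: x * x * x

d_hat = derivative(shift, 0)   # a single tag is created here and then reused
print(d_hat(d_hat(cube))(2))   # 0, although the second derivative of x**3 at 2 is 12

# Hand-written eta expansion for this one type: the tag is created per application.
d_hat_eta = lambda f: lambda y: derivative(lambda u: shift(u)(f)(y), 0)
print(d_hat_eta(d_hat_eta(cube))(2))  # 12
```

In this sketch `d_hat` carries one tag into both of its uses, so the inner extraction consumes the perturbation that the outer one needed. The eta-expanded variant is written by hand for this specific function type; the abstract notes that general versions of either fix seem very difficult to implement without giving up the complexity guarantees of forward AD.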

DOI

https://doi.org/10.1017/S095679681900008X

Oleksandr Manzyuk

Barak A. Pearlmutter

Maynooth University

Ireland

Alexey Radul

David Rush

Jeffrey Mark Siskind

School of Electrical and Computer Engineering, Purdue University

United States

Session Program

Tue 25 Aug
Displayed time zone: Eastern Time (US & Canada)

	11:30 - 13:00	New York 3 (JFP talks), ICFP Program, at ICFP NY 3. Chair(s): Jeremy Gibbons (Department of Computer Science, University of Oxford). Public livestreams: YouTube, Bilibili (China)

	11:30	11m	Talk	A theory of RPC calculi for client–server model	Kwanghoon Choi, Byeong-Mo Chang
	11:41	11m	Talk	The full-reducing Krivine abstract machine KN simulates pure normal-order reduction in lockstep: A proof via corresponding calculus	Álvaro García Perez (IMDEA Software Institute), Pablo Nogueira (ESNE University School of Design, Innovation and Technology)
	11:52	11m	Talk	Local algebraic effect theories	Žiga Lukšič, Matija Pretnar (University of Ljubljana, Slovenia)
	12:03	11m	Talk	Heterogeneous binary random-access lists	Wouter Swierstra (Utrecht University, Netherlands)
	12:15	11m	Talk	POPLMark reloaded: Mechanizing proofs by logical relations	Andreas Abel (Gothenburg University), Guillaume Allais (University of St Andrews), Aliya Hameer (McGill University), Brigitte Pientka (McGill University), Alberto Momigliano (Università degli Studi di Milano), Steven Schäfer (Google, Aarhus), Kathrin Stark (Princeton University, USA)
	12:26	11m	Talk	Perturbation confusion in forward automatic differentiation of higher-order functions	Oleksandr Manzyuk, Barak A. Pearlmutter (Maynooth University), Alexey Radul, David Rush, Jeffrey Mark Siskind (School of Electrical and Computer Engineering, Purdue University)
	12:37	11m	Talk	Elastic Sheet-Defined Functions: Generalising Spreadsheet Functions to Variable-Size Input Arrays	Matt McCutchen, Judith Borghouts, Andrew D. Gordon (Microsoft Research and University of Edinburgh), Simon Peyton Jones (Microsoft, UK), Advait Sarkar (Microsoft Research and University of Cambridge)
	12:48	11m	Talk	Emerging languages: An alternative approach to teaching programming languages	Saverio Perugini (Ave Maria University)

	22:30 - 00:00	Asia 3 (JFP talks), ICFP Program, at ICFP Asia 3. Chair(s): Jeremy Gibbons (Department of Computer Science, University of Oxford). Public livestreams: YouTube, Bilibili (China)

	22:30	11m	Talk	A theory of RPC calculi for client–server model	Kwanghoon Choi, Byeong-Mo Chang
	22:41	11m	Talk	The full-reducing Krivine abstract machine KN simulates pure normal-order reduction in lockstep: A proof via corresponding calculus	Álvaro García Perez (IMDEA Software Institute), Pablo Nogueira (ESNE University School of Design, Innovation and Technology)
	22:52	11m	Talk	Local algebraic effect theories	Žiga Lukšič, Matija Pretnar (University of Ljubljana, Slovenia)
	23:03	11m	Talk	Heterogeneous binary random-access lists	Wouter Swierstra (Utrecht University, Netherlands)
	23:15	11m	Talk	POPLMark reloaded: Mechanizing proofs by logical relations	Andreas Abel (Gothenburg University), Guillaume Allais (University of St Andrews), Aliya Hameer (McGill University), Brigitte Pientka (McGill University), Alberto Momigliano (Università degli Studi di Milano), Steven Schäfer (Google, Aarhus), Kathrin Stark (Princeton University, USA)
	23:26	11m	Talk	Perturbation confusion in forward automatic differentiation of higher-order functions	Oleksandr Manzyuk, Barak A. Pearlmutter (Maynooth University), Alexey Radul, David Rush, Jeffrey Mark Siskind (School of Electrical and Computer Engineering, Purdue University)
	23:37	11m	Talk	Elastic Sheet-Defined Functions: Generalising Spreadsheet Functions to Variable-Size Input Arrays	Matt McCutchen, Judith Borghouts, Andrew D. Gordon (Microsoft Research and University of Edinburgh), Simon Peyton Jones (Microsoft, UK), Advait Sarkar (Microsoft Research and University of Cambridge)
	23:48	11m	Talk	Emerging languages: An alternative approach to teaching programming languages	Saverio Perugini (Ave Maria University)