autodiff
autodiff, keyword, block
The autodiff keyword introduces a block intended for stochastic gradient descent. The logic within the autodiff block is automatically differentiated. The keyword is followed by the name of the observation table. The runtime uses Adam for the parameter updates.
```envision
epochs = 500
const lr = 0.1

autodiff Scalar epochs: epochs learningRate: lr with
  params a auto
  s = a * a
  return s

show scalar "" with a // 0.00
```
The option epochs is optional; its default value is 10. It accepts a scalar number, so it can be provided through a variable.

The option learningRate is optional; its default value is 0.01. It requires a constant scalar value, but you can assign it through a const variable.
The observation table can be Scalar. Unlike each, autodiff does not accept a scan option.
All params must belong to small tables. The return statement must provide at least one scalar loss; extra return values are treated as metrics and can be named with name: value.
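A minimal sketch of the name: value naming mentioned above (the metric name delta is purely illustrative):

```envision
autodiff Scalar epochs: 500 with
  params a auto
  Loss = (a - 2)^2
  // 'Loss' drives the gradient descent; 'delta' is only reported as a metric
  return (Loss, delta: a - 2)

show scalar "" with a // converges towards 2.00
```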
Zedfunc values can be used inside an autodiff block (for example via valueAt), but parameters cannot be used to build a zedfunc (linear(a) yields no gradient).
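A minimal sketch of the supported direction, assuming the standard zedfunc constructors linear and constant: the zedfunc is built outside the block, and only its evaluation point is a parameter.

```envision
z = linear(2) + constant(1) // the zedfunc t -> 2 * t + 1, built outside the block

autodiff Scalar epochs: 500 with
  params a auto
  // valueAt is differentiable with respect to the evaluation point 'a'
  Loss = valueAt(z, a)^2
  return Loss

show scalar "" with a // converges towards -0.50
```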
The gradient mode
The gradient mode is intended for debugging unexpected behaviors in an autodiff block. It returns the gradient of the parameters (not the update!) during one, and only one, epoch. When a parameter is used multiple times within an epoch, for example when it originates from the Scalar table or an upstream table, the gradient is accumulated across these occurrences, and the sum of the gradients is returned.
```envision
autodiff Scalar mode:"gradient" with
  params X auto(1, 0)
  Loss = X * X
  return Loss

show scalar "d(Loss)/d(X) for Loss = X * X" with X
```
In the script above, the parameter is not updated; instead, it stores the gradient computed during the one and only “epoch”. Since Loss = X * X and auto(1, 0) initializes X at 1, the displayed value is d(Loss)/d(X) = 2X = 2.
Lookups between full tables are supported inside autodiff blocks, and the gradient flows back to the looked-up vector.
```envision
table F1[kp1] = with
  [| as kp1, as k2, as X |]
  [| "a", "1", 1.0 |]
  [| "b", "3", 2.0 |]
  [| "c", "3", 3.0 |]
  [| "d", "2", 4.0 |]

table F2[kp2] = with
  [| as kp2, as mu |]
  [| "1", 1.0 |]
  [| "2", 2.0 |]
  [| "3", 3.0 |]

autodiff Scalar mode:"gradient" with
  params F2.mu
  F1.mu = F2.mu[F1.k2]
  loss = sum(F1.mu * F1.X)
  return loss

show table "Gradient" with
  F2.kp2
  F2.mu
```
This outputs the following table:

| kp2 | mu |
|---|---|
| 1 | 1 |
| 2 | 4 |
| 3 | 5 |

Each gradient is the sum of F1.X over the lines of F1 that look up the corresponding key: for instance, kp2 = "3" is looked up by the lines "b" and "c", hence 2.0 + 3.0 = 5.
The parallel mode
The execution of the autodiff block can be automatically parallelized. This execution mode is intended for larger datasets. The parallelization typically incurs a modest decrease in convergence speed, counted in epochs, in exchange for a faster wall-clock execution.
```envision
autodiff Scalar epochs: 500 learningRate: 0.1 mode: "parallel" with
  params a auto
  s = a * a
  return s

show scalar "" with a // 0.00
```
In the script above, the mode option is set to parallel.
The batch mode
For datasets with highly dissimilar observations, batch processing may stabilize the learning process. It updates the model’s parameters using a batch of observations instead of an individual observation. This method may require more iterations to converge, but is useful when some observations differ significantly from the others.
```envision
table Obs = extend.range(10000)
lambda = 10
Obs.Y = random.poisson(lambda into Obs)

autodiff Obs batch:100 with
  params lambdaEstimation auto(9.5, 0)
  return -loglikelihood.poisson(lambdaEstimation, Obs.Y)

show summary "Regressed Poisson distribution (batch)" with lambdaEstimation // 10.02

autodiff Obs with
  params lambdaEstimation auto(9.5, 0)
  return -loglikelihood.poisson(lambdaEstimation, Obs.Y)

show summary "Regressed Poisson distribution (no batch)" with lambdaEstimation // 9.90
```
In the script above, we estimate the Poisson parameter. In this run, the batch-mode estimate (10.02) is closer to the true value lambda = 10 than the estimate obtained without batching (9.90).
The validation mode
The validation mode is intended to monitor overfitting. It splits the dataset into two parts: training and validation. The training data is used as usual, while the validation data is not used for updating the block’s parameters. Validation can only be applied if the parameters are located in upstream, full, or upstream-cross tables. If a parameter is located in the observation table, its lines in the validation set will not be updated.
```envision
table Observations = extend.range(50)
Observations.Cat = random.integer(5)
Observations.Y = Observations.Cat + random.normal(0, 0.1)
table Categories[Cat] = by Observations.Cat
Observations.IsTest = random.binomial(0.5 into Observations)

// The validation option expects either:
// - a scalar number between 0 and 1
// - a boolean vector in the observation table which indicates
//   whether the observation belongs to the validation set
autodiff Observations validation:Observations.IsTest with
  params Categories.Alpha auto
  Loss = (Categories.Alpha - Observations.Y)^2
  return Loss

show table "Validation from boolean vector" with Categories.Alpha

autodiff Observations validation:0.5 with
  params Categories.Alpha auto
  Loss = (Categories.Alpha - Observations.Y)^2
  return Loss

show table "Validation from scalar number" with Categories.Alpha
```
The validation curves can be found in the autodiff metrics of the run.
autodiff, keyword, pure function option
The autodiff keyword, applied to a pure function, indicates that the function can be executed inside an autodiff block.
```envision
def autodiff pure mySquare(x: number) with
  return x * x

autodiff Scalar epochs: 500 with
  params a auto
  return mySquare(a)

show scalar "" with a // 0.00
```
In the present reference documentation, pure functions that are part of the Envision standard library and that can be executed inside an autodiff block are marked as autodiff.