How to dynamically scale an autoencoder for different numbers of input features

I am trying to use a single piece of code to create an autoencoder with the configuration
hidden=c(6,6,6), as shown below.
Questions:
1) How can I use the same code for multiple input data frames, where one might have 20 features and others 400, 1000, and so on? The goal is to change the hidden vector to match.
2) What does hidden=c(6,6,6) mean? I see that there are 3 hidden layers with 6 neurons each, but where are the input, latent, and output layers, and how many neurons does each have? Do we just provide the encoding section, with H2O mirroring it for decoding?
3) To make it dynamic, can we use hidden = c(Num_input_feats, int(Num_input_feats/2), int(Num_input_feats/3), int(Num_input_feats/2), Num_input_feats)?
E.g., if the data had 30 features, would the complete autoencoder be
hidden = c(30, 15, 8, 15, 30), or should it be
hidden = c(30, 15, 8)?
var.dl <- h2o.deeplearning(x = parms, training_frame = h2o.dat, autoencoder = TRUE, reproducible = TRUE, seed = 1234, activation = "TanhWithDropout", hidden = c(6, 6, 6), epochs = 50)
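To make question 2 concrete, here is a minimal sketch of the layer layout that hidden=c(6,6,6) would imply. It rests on an assumption taken from my reading of the H2O documentation: `hidden` lists only the hidden layers, the input layer is sized from the columns in `x` (categorical columns are one-hot expanded, so it can be wider than the column count), and with `autoencoder=TRUE` the output layer matches the input. The helper `full_layout` is hypothetical and not part of H2O.

```r
# Hypothetical helper (not an H2O function): spell out every layer of the
# network implied by a given `hidden` vector, ASSUMING the input and output
# layers both have `n_inputs` units (i.e. numeric features only).
full_layout <- function(n_inputs, hidden) {
  c(input = n_inputs, hidden, output = n_inputs)
}

# 30 numeric features with the configuration from the question.
layout_30 <- full_layout(30, c(6, 6, 6))
print(layout_30)
```

Under that assumption, a data frame with 30 numeric features would give a 30-6-6-6-30 network, so the three 6-unit layers are all of the hidden layers and nothing is mirrored automatically.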
Answers
-
Hi @uvs, are you following a tutorial in H2O.ai's documentation? If so, can you provide the link?
-
No, I am not. I inherited R code that had 3 layers with 6 nodes each, so I was not sure how the encoder and decoder layers worked. In my example,
if there are 30 input features, then the first option seems to be the right way to go rather than the second (30,15,8), since with the second the output is not the same size as the input (30). Am I correct?
hidden = c(30, 15, 8, 15, 30), or should it be
hidden = c(30, 15, 8)?
I am just making each hidden layer half the size of the previous one, and then doing the opposite for decoding so as to get back the original input shape. Something similar is shown here:
https://towardsdatascience.com/credit-card-fraud-detection-using-autoencoders-in-h2o-399cbb7ae4f1
Is there a better approach? What if I have 300 features; will this methodology still work?
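One way to sketch the halving scheme described above so that it adapts to any feature count is shown below. This is only an illustration: `make_hidden`, `depth`, and `min_units` are hypothetical names, and the input and output sizes are deliberately left out of the vector on the assumption that H2O sizes those layers from the data itself when `autoencoder=TRUE`.

```r
# Hypothetical helper: build a symmetric `hidden` vector by halving the layer
# width `depth` times, then mirroring it for the decoder. The input/output
# sizes are NOT included, on the assumption that H2O adds those layers itself.
make_hidden <- function(n_feats, depth = 2, min_units = 2) {
  # Encoder widths: n/2, n/4, ... rounded up, never below `min_units`.
  encoder <- pmax(min_units, ceiling(n_feats / 2^seq_len(depth)))
  # Mirror the encoder for the decoder without repeating the bottleneck.
  c(encoder, rev(encoder)[-1])
}

hidden_30  <- make_hidden(30)    # 15, 8, 15
hidden_300 <- make_hidden(300)   # 150, 75, 150
print(hidden_30)
print(hidden_300)
```

Passing something like `hidden = make_hidden(length(parms))` to the `h2o.deeplearning` call would then resize the layers for each data frame. Whether halving is a good ratio for 300 or 1000 features is a tuning question rather than something this sketch settles, and `length(parms)` undercounts the input width if any columns are categorical.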
-
Any feedback?