I have a working classifier with a dataset split in a train set (70%) and a test set (30%).
However, I'd like to implement a validation set as well (so that: 70% train, 20% validation and 10% test). The sets should be randomly chosen and the results should be averaged over 10 different assignments.
Any ideas how to do this? Below is my implementation using a train and test set only:
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

def classifier(samples):
    # load the datasets
    dataset = samples
    data_train, data_test, target_train, target_test = train_test_split(dataset["data"], dataset["target"], test_size=0.30, random_state=42)
    # fit a k-nearest neighbor model to the data
    model = KNeighborsClassifier()
    model.fit(data_train, target_train)
    print(model)
    # make predictions
    expected = target_test
    predicted = model.predict(data_test)
    # summarize the fit of the model
    print(metrics.classification_report(expected, predicted))
    print(metrics.confusion_matrix(expected, predicted))
				
                        
For what you're describing, you just need to use train_test_split, followed by a second split applied to its results. Adapting the tutorial example, start with something like this (the iris dataset is used here only as a stand-in for your samples dict):
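from sklearn import datasets
from sklearn.model_selection import train_test_split

# iris is only a stand-in here; substitute your own
# dataset["data"] / dataset["target"] arrays
iris = datasets.load_iris()
X, y = iris.data, iris.target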
Then, just like there, make the initial train/test partition, holding out 10% for the final test set:
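# 10% of the data becomes the untouched test set;
# the remaining 90% is left for train + validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, random_state=42)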
Now you just need to split that 0.9 train portion into two more pieces, so that you end up with 70% train and 20% validation of the original data:
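# 2/9 of the remaining 90% equals 20% of the full dataset,
# which leaves 70% for training and 20% for validation
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=2/9, random_state=42)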
If you want 10 random train/validation assignments, repeat that last split 10 times with different random states and average the results (note that the validation sets will overlap across repetitions); for example:
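from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import numpy as np

# average the validation score over 10 random 70/20 assignments of the
# non-test data; accuracy is used here just as an example metric
scores = []
for seed in range(10):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X_train, y_train, test_size=2/9, random_state=seed)
    model = KNeighborsClassifier()
    model.fit(X_tr, y_tr)
    scores.append(accuracy_score(y_val, model.predict(X_val)))

print("mean validation accuracy over 10 splits: %.3f" % np.mean(scores))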
Alternatively, you could replace the last line with 10-fold cross-validation (see KFold and the related helpers in scikit-learn), which gives you non-overlapping validation folds:
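from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 10-fold CV on the train part: each sample is used for validation exactly once,
# so the folds do not overlap, unlike the repeated random splits above
cv_scores = cross_val_score(KNeighborsClassifier(), X_train, y_train, cv=10)
print("mean 10-fold CV accuracy: %.3f" % cv_scores.mean())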
The main point is to build the CV sets from the train part of the initial train/test partition.