GPs on Non-Euclidean Input Spaces
GPs on non-Euclidean input spaces have become increasingly relevant in recent years. fvgp can be used for that purpose as long as a valid kernel is provided. Of course, if mean functions and noise functions are also provided, they have to operate on these non-Euclidean spaces as well.
In this example, we run a small GP on words. It is a proof of concept; the results themselves are not particularly meaningful.
#install fvgp (the version is pinned for this example)
#!pip install fvgp==4.2.0
import numpy as np
import matplotlib.pyplot as plt
from fvgp import GP
from dask.distributed import Client
%load_ext autoreload
%autoreload 2
#making x_data a list allows us to put any objects or structures into it.
x_data = ['hello','world','this','is','fvgp']
y_data = np.array([2.,1.9,1.8,3.0,5.])
from fvgp.gp_kernels import matern_kernel_diff1
def string_distance(string1, string2):
    #the difference in length counts toward the distance
    difference = abs(len(string1) - len(string2))
    #compare the strings character by character over their common length
    common_length = min(len(string1),len(string2))
    string1 = string1[0:common_length]
    string2 = string2[0:common_length]
    for i in range(len(string1)):
        if string1[i] != string2[i]:
            difference += 1.
    return difference
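The distance above is a Hamming distance over the common prefix of the two strings plus their difference in length. A compact restatement, written here under the hypothetical name `string_distance_compact` (an illustration, not part of fvgp), makes that structure explicit and can be used to sanity-check a few word pairs:

```python
def string_distance_compact(string1, string2):
    """Length difference plus character mismatches over the common prefix.

    Hypothetical helper for illustration; it is meant to give the same
    values as string_distance, always returned as a float.
    """
    # zip stops at the shorter string, so only the common prefix is compared.
    mismatches = sum(c1 != c2 for c1, c2 in zip(string1, string2))
    return float(abs(len(string1) - len(string2)) + mismatches)


pairs = [("hello", "world"), ("this", "is"), ("fvgp", "full"), ("fvgp", "fvgp")]
for a, b in pairs:
    print(a, b, string_distance_compact(a, b))
```

Note that the comparison is positional: 'this' and 'is' share the substring 'is', yet they are four units apart because their characters are compared index by index from the start.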
def kernel(x1,x2,hps,obj):
    #pairwise string distances between all entries of x1 and x2
    d = np.zeros((len(x1),len(x2)))
    for i, string1 in enumerate(x1):
        for j, string2 in enumerate(x2):
            d[i,j] = string_distance(string1,string2)
    #hps[0] scales the kernel (signal variance), hps[1] is the length scale
    return hps[0] * matern_kernel_diff1(d,hps[1])
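Whether a kernel built this way is valid is worth checking: a Matérn form applied to an arbitrary distance is not guaranteed to be positive semi-definite. The sketch below, using plain NumPy only, builds the Gram matrix for the five training words and inspects its symmetry and smallest eigenvalue. It assumes that `matern_kernel_diff1` has the Matérn 3/2 form; `matern32` and `gram_matrix` are hypothetical names introduced here, and the check only covers this data set and this choice of hyperparameters.

```python
import numpy as np

words = ["hello", "world", "this", "is", "fvgp"]


def string_distance(s1, s2):
    # Length difference plus mismatches over the common prefix.
    return abs(len(s1) - len(s2)) + sum(a != b for a, b in zip(s1, s2))


def matern32(d, length):
    # Matern 3/2 form; ASSUMED to correspond to fvgp's matern_kernel_diff1.
    r = np.sqrt(3.0) * d / length
    return (1.0 + r) * np.exp(-r)


def gram_matrix(x, hps):
    # Hypothetical helper: kernel matrix of x with itself.
    d = np.array([[string_distance(a, b) for b in x] for a in x], dtype=float)
    return hps[0] * matern32(d, hps[1])


K = gram_matrix(words, np.ones(2))
is_symmetric = np.allclose(K, K.T)
min_eigenvalue = np.linalg.eigvalsh(K).min()
print("symmetric:", is_symmetric, "| smallest eigenvalue:", min_eigenvalue)
```

A clearly negative smallest eigenvalue would indicate that the kernel is not valid for these inputs. A positive one shows validity only for this particular set of points, not in general.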
my_gp = GP(x_data,y_data,init_hyperparameters=np.ones((2)), gp_kernel_function=kernel, info = False)
bounds = np.array([[0.001,100.],[0.001,100]])
my_gp.train(hyperparameter_bounds=bounds)
print("hyperparameters: ", my_gp.get_hyperparameters())
print("prediction : ",my_gp.posterior_mean(['full'])["f(x)"])
print("uncertainty: ",np.sqrt(my_gp.posterior_covariance(['full'])["v(x)"]))
hyperparameters: [1.43431191 0.40010055]
prediction : [2.74004694]
uncertainty: [1.19762762]
/home/marcus/Coding/fvGP/fvgp/gp.py:261: UserWarning: hyperparameter_bounds not provided. They will have to be provided in the training call.
warnings.warn("hyperparameter_bounds not provided. "
/home/marcus/Coding/fvGP/fvgp/gp.py:301: UserWarning: No noise function or measurement noise provided. Noise variances will be set to 1% of mean(y_data).
self.likelihood = GPlikelihood(self.data.x_data,
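The printed prediction is almost exactly the mean of y_data (2.74), and the printed uncertainty is almost exactly the square root of the first hyperparameter. This is what one would expect when the test word is far from every training word relative to the trained length scale: 'full' is at distance 3 or more from each of them, while the length scale is about 0.4, so the GP falls back to its prior. The NumPy sketch below reproduces this under three assumptions that are consistent with the output and the warning above but are not taken from fvgp's source: a constant prior mean equal to mean(y_data), a noise variance of 1% of mean(y_data), and a Matérn 3/2 form for `matern_kernel_diff1`.

```python
import numpy as np

words = ["hello", "world", "this", "is", "fvgp"]
y = np.array([2.0, 1.9, 1.8, 3.0, 5.0])
hps = np.array([1.43431191, 0.40010055])  # trained values printed above


def string_distance(s1, s2):
    # Length difference plus mismatches over the common prefix.
    return abs(len(s1) - len(s2)) + sum(a != b for a, b in zip(s1, s2))


def kernel(x1, x2, hps):
    # Matern 3/2 on string distances; ASSUMED form of matern_kernel_diff1.
    d = np.array([[string_distance(a, b) for b in x2] for a in x1], dtype=float)
    r = np.sqrt(3.0) * d / hps[1]
    return hps[0] * (1.0 + r) * np.exp(-r)


prior_mean = y.mean()             # ASSUMPTION: constant prior mean
noise_variance = 0.01 * y.mean()  # ASSUMPTION: taken from the warning text

K = kernel(words, words, hps) + noise_variance * np.eye(len(words))
k_star = kernel(words, ["full"], hps)   # shape (5, 1)
k_ss = kernel(["full"], ["full"], hps)  # shape (1, 1)

# Standard GP posterior mean and variance at the test point.
post_mean = prior_mean + k_star.T @ np.linalg.solve(K, y - prior_mean)
post_var = k_ss - k_star.T @ np.linalg.solve(K, k_star)
post_std = np.sqrt(np.diag(post_var))

print("prediction :", post_mean)
print("uncertainty:", post_std)
```

With a length scale this small compared to the distances between words, the training points barely inform each other or the test point, which is consistent with the remark that this example is a proof of concept only.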