Calculating the Correct/Fastest Cost Matrix for solving TSP for a large dataset [on hold]
Link to the code/idea that is used here - Extracting an adjaceny matrix containing haversine distance from points on map
Note: This is a question that is based on the link above, where the cost matrix is calculated using combinations_with_replacement
, which is not giving me the correct results.
What is the purpose of the code below :
Square Cost/Distance Matrix or Adjacent Matrix of points, each point constitutes of coordinates, The reason I am writing the code is to solve the TSP for nearly 200k points to get the optimal solution, I need to get the cost matrix and then apply the 2-opt or 3-opt algorithm on top of it
So let's say as below if we have 10 points our cost matrix will be a 10x10 matrix with 0's on the diagonals. I have to use this cost matrix later to solve a TSP for the coordinates given.
Function calls for both the functions and the Complete code is given below
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
Local TSP Solver- Implementing the 2-opt method to solve a TSP.
'''
import sys, os
import pandas as pd
from glob import glob
import numpy as np
from itertools import combinations_with_replacement, combinations, permutations
from haversine import haversine
import time
class Two_opt(object):
'''Using the 2-opt algoirthm to improve on the TSP solver'''
def __init__(self,arg):
'''Return the global variable cities as a numpy array.'''
self.path = arg
for f in glob(self.path+"/*.csv", recursive=True):
if 'cities' in f:
self.cities = pd.read_csv(f, usecols = ['X','Y']).values
def saving_adj_mat_method2(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(combinations_with_replacement(points,2)):
adj_mat[i]=haversine(point1, point2)
# for i in range(10):
# adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
# adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def saving_adj_mat_method1(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(permutations(points,2)):
adj_mat[i]=haversine(point1, point2)
for i in range(10):
adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def adjmat(self):
'''Function to calculate the adj Matrix/Cost matrix.'''
adj_mat1 = np.empty(10**2)
adj_mat2 = np.empty(20000**2)
adj_mat3 = np.empty(20000**2)
adj_mat4 = np.empty(20000**2)
adj_mat5 = np.empty(20000**2)
adj_mat6 = np.empty(20000**2)
adj_mat7 = np.empty(20000**2)
adj_mat8 = np.empty(20000**2)
adj_mat9 = np.empty(20000**2)
adj_mat10 = np.empty((len(self.cities)-(9*20000))**2)
#print(self.cities[:5])
print(f'Number of cities {len(self.cities)}')
# 1st 20k points
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
#2nd 20k points
adj_mat2 = self.saving_adj_mat(20000,40000)
#3rd 20k points
adj_mat3 = self.saving_adj_mat(40000,60000)
#4th 20k points
adj_mat4 = self.saving_adj_mat(60000,80000)
#5th 20k points
adj_mat5 = self.saving_adj_mat(80000,100000)
#6th 20k points
adj_mat6 = self.saving_adj_mat(100000,120000)
#7th 20k points
adj_mat7 = self.saving_adj_mat(120000,140000)
#8th 20k points
adj_mat8 = self.saving_adj_mat(140000,160000)
#9th 20k points
adj_mat9 = self.saving_adj_mat(160000,180000)
# #Last points
adj_mat10 = self.saving_adj_mat(180000,len(self.cities))
'''
#store the array
file = self.path + '/adjmat_method1.npz'
np.savez(file, adj_mat1)
file = self.path + '/adjmat_method2.npz'
np.savez(file, adj_mat1c)
def getnumpyarr(self):
'''loading the pre-calculated numpy matrices'''
datavals =
for f in glob(self.path + "/*.npz", recursive=True):
datavals.append(np.load(f))
return datavals
def getcompletemat(self, *args):
'''Get the matrix reshaped and combined to be used for the algorithm.'''
data1 = args[0]
arr1 = data1['arr_0'].reshape(10, 10)
# arr2 = data1['arr_1'].reshape(20000, 20000)
# arr3 = data1['arr_2'].reshape(20000, 20000)
# arr4 = data1['arr_3'].reshape(20000, 20000)
# arr5 = data1['arr_4'].reshape(20000, 20000)
# arr6 = data1['arr_5'].reshape(20000, 20000)
# arr7 = data1['arr_6'].reshape(20000, 20000)
# arr8 = data1['arr_7'].reshape(20000, 20000)
# arr9 = data2['arr_0'].reshape(20000, 20000)
# arr10 = data2['arr_1'].reshape((len(self.cities)-(9*20000)), (len(self.cities)-(9*20000)))
print(arr1)
if __name__ == "__main__":
wd = os.getcwd()
path = wd.split("/")
newpath = "/".join(path[:-1])
check_npz=False
for f in glob(newpath + "/*.npz", recursive=True):
check_npz=True
obj=Two_opt(newpath)
if not check_npz:obj.adjmat()
datalist = obj.getnumpyarr()
for i in range(len(datalist)):
print(f'data present n data1 - {datalist[i]} n')
print(f'keys of data1 {datalist[i].files}')
obj.getcompletemat(datalist[i])
@Graipher gave a solution on which the 1st function call is based on, Correct me if I am wrong - I could have put this in the comments but cannot as I do not have enough reputation here. With combinations_with_replacement
we will not get all combinations for a square-cost matrix? Please tell me if I am missing something here.
combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC,
It does not give us the points, (BA) and (CA, CB). So if we have 10
points, we will only get 10, 9, 8, 7, ...1 and so on unique
combinations to fill the cost matrix. That will not give the correct
cost/distance matrix
Instead, I used permutations(), to get the correct and symmetric matrix, but this might not be optimal time-complexity. I need some advice here.
Output for 1st Function call - This does not give the correct matrix, as expected, this first function uses combinations_with_replacement, this is faster but gives the wrong answer
[[ 0.00000000e+000 1.27332308e+004 6.57898227e+003 5.86284373e+003
3.87871486e+003 3.67382793e+003 9.38784305e+003 6.46237005e+003
6.51036438e+003 5.43284410e+003]
[ 0.00000000e+000 1.62585887e+004 8.60903414e+003 1.41825246e+004
1.16056316e+004 1.70335895e+004 6.30445511e+003 1.14165602e+004
1.44391026e+004 0.00000000e+000]
[ 8.21256846e+003 7.62939803e+003 9.52888602e+003 6.59919871e+003
1.16551154e+004 1.11473488e+004 8.20792339e+003 0.00000000e+000
9.70389701e+003 8.46124341e+003]
[ 1.43339885e+004 3.99278066e+003 1.13084634e+004 1.12586665e+004
0.00000000e+000 2.61777230e+003 6.51291219e+003 9.11701996e+003
3.57624326e+003 1.55720772e+003]
[ 0.00000000e+000 8.99206700e+003 6.76801760e+003 2.97848880e+003
3.57521856e+003 0.00000000e+000 1.56060528e+004 7.79162477e+003
5.44506378e+003 0.00000000e+000]
[ 8.75252369e+003 1.03348289e+004 0.00000000e+000 3.06388021e+003
0.00000000e+000 4.68834210e-310 1.08566335e+012 6.90532443e-310
4.68834210e-310 1.44020725e+001]
[ 6.90532443e-310 4.68834210e-310 1.78169586e-267 6.90532443e-310
6.90532400e-310 6.88233445e-321 4.68834280e-310 4.68834246e-310
0.00000000e+000 0.00000000e+000]
[ 2.02566915e-322 -7.59297323e+303 1.08823978e-269 -3.29118277e+304
-6.17074930e+303 4.23627391e-183 1.51575303e-207 1.11491862e-187
-1.02851904e+304 -2.02003913e-269]
[ 6.90533811e-310 6.90533890e-310 4.02018951e-197 6.90532598e-310
4.68834210e-310 -1.47320674e-196 6.90532598e-310 4.68834210e-310
7.05462116e+117 6.90532598e-310]
[ 6.90533816e-310 2.47424187e+257 6.90532720e-310 4.68834211e-310
-4.20571524e-040 6.90532554e-310 4.68834210e-310 -1.76018832e+290
6.90532554e-310 4.68834210e-310]]
Output for 2nd Function call - This gives the correct answer, it is way slower for a large number of points, so it not optimal and needs to be checked for better time complexity
[[ 0. 12733.23081175 6578.98226932 5862.84372969
3878.71485682 3673.82793216 9387.84304626 6462.37004961
6510.36438472 5432.84409789]
[12733.23081175 0. 16258.5887252 8609.03413858
14182.52456514 11605.63163523 17033.5895249 6304.4551083
11416.5602085 14439.1026007 ]
[ 6578.98226932 16258.5887252 0. 8212.56845897
7629.39802629 9528.88602263 6599.19870899 11655.11542038
11147.34880286 8207.9233913 ]
[ 5862.84372969 8609.03413858 8212.56845897 0.
9703.89700789 8461.24341497 14333.98852641 3992.7806584
11308.46341052 11258.66649267]
[ 3878.71485682 14182.52456514 7629.39802629 9703.89700789
0. 2617.77230465 6512.91218854 9117.01995685
3576.24326113 1557.20772372]
[ 3673.82793216 11605.63163523 9528.88602263 8461.24341497
2617.77230465 0. 8992.06700473 6768.01759819
2978.48879937 3575.21855563]
[ 9387.84304626 17033.5895249 6599.19870899 14333.98852641
6512.91218854 8992.06700473 0. 15606.05278166
7791.62477038 5445.06378019]
[ 6462.37004961 6304.4551083 11655.11542038 3992.7806584
9117.01995685 6768.01759819 15606.05278166 0.
8752.52368527 10334.82891513]
[ 6510.36438472 11416.5602085 11147.34880286 11308.46341052
3576.24326113 2978.48879937 7791.62477038 8752.52368527
0. 3063.88021254]
[ 5432.84409789 14439.1026007 8207.9233913 11258.66649267
1557.20772372 3575.21855563 5445.06378019 10334.82891513
3063.88021254 0. ]]
python-3.x
New contributor
put on hold as unclear what you're asking by Zeta, alecxe, Jamal♦ 20 hours ago
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
add a comment |
Link to the code/idea that is used here - Extracting an adjaceny matrix containing haversine distance from points on map
Note: This is a question that is based on the link above, where the cost matrix is calculated using combinations_with_replacement
, which is not giving me the correct results.
What is the purpose of the code below :
Square Cost/Distance Matrix or Adjacent Matrix of points, each point constitutes of coordinates, The reason I am writing the code is to solve the TSP for nearly 200k points to get the optimal solution, I need to get the cost matrix and then apply the 2-opt or 3-opt algorithm on top of it
So let's say as below if we have 10 points our cost matrix will be a 10x10 matrix with 0's on the diagonals. I have to use this cost matrix later to solve a TSP for the coordinates given.
Function calls for both the functions and the Complete code is given below
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
Local TSP Solver- Implementing the 2-opt method to solve a TSP.
'''
import sys, os
import pandas as pd
from glob import glob
import numpy as np
from itertools import combinations_with_replacement, combinations, permutations
from haversine import haversine
import time
class Two_opt(object):
'''Using the 2-opt algoirthm to improve on the TSP solver'''
def __init__(self,arg):
'''Return the global variable cities as a numpy array.'''
self.path = arg
for f in glob(self.path+"/*.csv", recursive=True):
if 'cities' in f:
self.cities = pd.read_csv(f, usecols = ['X','Y']).values
def saving_adj_mat_method2(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(combinations_with_replacement(points,2)):
adj_mat[i]=haversine(point1, point2)
# for i in range(10):
# adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
# adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def saving_adj_mat_method1(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(permutations(points,2)):
adj_mat[i]=haversine(point1, point2)
for i in range(10):
adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def adjmat(self):
'''Function to calculate the adj Matrix/Cost matrix.'''
adj_mat1 = np.empty(10**2)
adj_mat2 = np.empty(20000**2)
adj_mat3 = np.empty(20000**2)
adj_mat4 = np.empty(20000**2)
adj_mat5 = np.empty(20000**2)
adj_mat6 = np.empty(20000**2)
adj_mat7 = np.empty(20000**2)
adj_mat8 = np.empty(20000**2)
adj_mat9 = np.empty(20000**2)
adj_mat10 = np.empty((len(self.cities)-(9*20000))**2)
#print(self.cities[:5])
print(f'Number of cities {len(self.cities)}')
# 1st 20k points
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
#2nd 20k points
adj_mat2 = self.saving_adj_mat(20000,40000)
#3rd 20k points
adj_mat3 = self.saving_adj_mat(40000,60000)
#4th 20k points
adj_mat4 = self.saving_adj_mat(60000,80000)
#5th 20k points
adj_mat5 = self.saving_adj_mat(80000,100000)
#6th 20k points
adj_mat6 = self.saving_adj_mat(100000,120000)
#7th 20k points
adj_mat7 = self.saving_adj_mat(120000,140000)
#8th 20k points
adj_mat8 = self.saving_adj_mat(140000,160000)
#9th 20k points
adj_mat9 = self.saving_adj_mat(160000,180000)
# #Last points
adj_mat10 = self.saving_adj_mat(180000,len(self.cities))
'''
#store the array
file = self.path + '/adjmat_method1.npz'
np.savez(file, adj_mat1)
file = self.path + '/adjmat_method2.npz'
np.savez(file, adj_mat1c)
def getnumpyarr(self):
'''loading the pre-calculated numpy matrices'''
datavals =
for f in glob(self.path + "/*.npz", recursive=True):
datavals.append(np.load(f))
return datavals
def getcompletemat(self, *args):
'''Get the matrix reshaped and combined to be used for the algorithm.'''
data1 = args[0]
arr1 = data1['arr_0'].reshape(10, 10)
# arr2 = data1['arr_1'].reshape(20000, 20000)
# arr3 = data1['arr_2'].reshape(20000, 20000)
# arr4 = data1['arr_3'].reshape(20000, 20000)
# arr5 = data1['arr_4'].reshape(20000, 20000)
# arr6 = data1['arr_5'].reshape(20000, 20000)
# arr7 = data1['arr_6'].reshape(20000, 20000)
# arr8 = data1['arr_7'].reshape(20000, 20000)
# arr9 = data2['arr_0'].reshape(20000, 20000)
# arr10 = data2['arr_1'].reshape((len(self.cities)-(9*20000)), (len(self.cities)-(9*20000)))
print(arr1)
if __name__ == "__main__":
wd = os.getcwd()
path = wd.split("/")
newpath = "/".join(path[:-1])
check_npz=False
for f in glob(newpath + "/*.npz", recursive=True):
check_npz=True
obj=Two_opt(newpath)
if not check_npz:obj.adjmat()
datalist = obj.getnumpyarr()
for i in range(len(datalist)):
print(f'data present n data1 - {datalist[i]} n')
print(f'keys of data1 {datalist[i].files}')
obj.getcompletemat(datalist[i])
@Graipher gave a solution on which the 1st function call is based on, Correct me if I am wrong - I could have put this in the comments but cannot as I do not have enough reputation here. With combinations_with_replacement
we will not get all combinations for a square-cost matrix? Please tell me if I am missing something here.
combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC,
It does not give us the points, (BA) and (CA, CB). So if we have 10
points, we will only get 10, 9, 8, 7, ...1 and so on unique
combinations to fill the cost matrix. That will not give the correct
cost/distance matrix
Instead, I used permutations(), to get the correct and symmetric matrix, but this might not be optimal time-complexity. I need some advice here.
Output for 1st Function call - This does not give the correct matrix, as expected, this first function uses combinations_with_replacement, this is faster but gives the wrong answer
[[ 0.00000000e+000 1.27332308e+004 6.57898227e+003 5.86284373e+003
3.87871486e+003 3.67382793e+003 9.38784305e+003 6.46237005e+003
6.51036438e+003 5.43284410e+003]
[ 0.00000000e+000 1.62585887e+004 8.60903414e+003 1.41825246e+004
1.16056316e+004 1.70335895e+004 6.30445511e+003 1.14165602e+004
1.44391026e+004 0.00000000e+000]
[ 8.21256846e+003 7.62939803e+003 9.52888602e+003 6.59919871e+003
1.16551154e+004 1.11473488e+004 8.20792339e+003 0.00000000e+000
9.70389701e+003 8.46124341e+003]
[ 1.43339885e+004 3.99278066e+003 1.13084634e+004 1.12586665e+004
0.00000000e+000 2.61777230e+003 6.51291219e+003 9.11701996e+003
3.57624326e+003 1.55720772e+003]
[ 0.00000000e+000 8.99206700e+003 6.76801760e+003 2.97848880e+003
3.57521856e+003 0.00000000e+000 1.56060528e+004 7.79162477e+003
5.44506378e+003 0.00000000e+000]
[ 8.75252369e+003 1.03348289e+004 0.00000000e+000 3.06388021e+003
0.00000000e+000 4.68834210e-310 1.08566335e+012 6.90532443e-310
4.68834210e-310 1.44020725e+001]
[ 6.90532443e-310 4.68834210e-310 1.78169586e-267 6.90532443e-310
6.90532400e-310 6.88233445e-321 4.68834280e-310 4.68834246e-310
0.00000000e+000 0.00000000e+000]
[ 2.02566915e-322 -7.59297323e+303 1.08823978e-269 -3.29118277e+304
-6.17074930e+303 4.23627391e-183 1.51575303e-207 1.11491862e-187
-1.02851904e+304 -2.02003913e-269]
[ 6.90533811e-310 6.90533890e-310 4.02018951e-197 6.90532598e-310
4.68834210e-310 -1.47320674e-196 6.90532598e-310 4.68834210e-310
7.05462116e+117 6.90532598e-310]
[ 6.90533816e-310 2.47424187e+257 6.90532720e-310 4.68834211e-310
-4.20571524e-040 6.90532554e-310 4.68834210e-310 -1.76018832e+290
6.90532554e-310 4.68834210e-310]]
Output for 2nd Function call - This gives the correct answer, it is way slower for a large number of points, so it not optimal and needs to be checked for better time complexity
[[ 0. 12733.23081175 6578.98226932 5862.84372969
3878.71485682 3673.82793216 9387.84304626 6462.37004961
6510.36438472 5432.84409789]
[12733.23081175 0. 16258.5887252 8609.03413858
14182.52456514 11605.63163523 17033.5895249 6304.4551083
11416.5602085 14439.1026007 ]
[ 6578.98226932 16258.5887252 0. 8212.56845897
7629.39802629 9528.88602263 6599.19870899 11655.11542038
11147.34880286 8207.9233913 ]
[ 5862.84372969 8609.03413858 8212.56845897 0.
9703.89700789 8461.24341497 14333.98852641 3992.7806584
11308.46341052 11258.66649267]
[ 3878.71485682 14182.52456514 7629.39802629 9703.89700789
0. 2617.77230465 6512.91218854 9117.01995685
3576.24326113 1557.20772372]
[ 3673.82793216 11605.63163523 9528.88602263 8461.24341497
2617.77230465 0. 8992.06700473 6768.01759819
2978.48879937 3575.21855563]
[ 9387.84304626 17033.5895249 6599.19870899 14333.98852641
6512.91218854 8992.06700473 0. 15606.05278166
7791.62477038 5445.06378019]
[ 6462.37004961 6304.4551083 11655.11542038 3992.7806584
9117.01995685 6768.01759819 15606.05278166 0.
8752.52368527 10334.82891513]
[ 6510.36438472 11416.5602085 11147.34880286 11308.46341052
3576.24326113 2978.48879937 7791.62477038 8752.52368527
0. 3063.88021254]
[ 5432.84409789 14439.1026007 8207.9233913 11258.66649267
1557.20772372 3575.21855563 5445.06378019 10334.82891513
3063.88021254 0. ]]
python-3.x
New contributor
put on hold as unclear what you're asking by Zeta, alecxe, Jamal♦ 20 hours ago
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
2
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
2
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday
add a comment |
Link to the code/idea that is used here - Extracting an adjaceny matrix containing haversine distance from points on map
Note: This is a question that is based on the link above, where the cost matrix is calculated using combinations_with_replacement
, which is not giving me the correct results.
What is the purpose of the code below :
Square Cost/Distance Matrix or Adjacent Matrix of points, each point constitutes of coordinates, The reason I am writing the code is to solve the TSP for nearly 200k points to get the optimal solution, I need to get the cost matrix and then apply the 2-opt or 3-opt algorithm on top of it
So let's say as below if we have 10 points our cost matrix will be a 10x10 matrix with 0's on the diagonals. I have to use this cost matrix later to solve a TSP for the coordinates given.
Function calls for both the functions and the Complete code is given below
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
Local TSP Solver- Implementing the 2-opt method to solve a TSP.
'''
import sys, os
import pandas as pd
from glob import glob
import numpy as np
from itertools import combinations_with_replacement, combinations, permutations
from haversine import haversine
import time
class Two_opt(object):
'''Using the 2-opt algoirthm to improve on the TSP solver'''
def __init__(self,arg):
'''Return the global variable cities as a numpy array.'''
self.path = arg
for f in glob(self.path+"/*.csv", recursive=True):
if 'cities' in f:
self.cities = pd.read_csv(f, usecols = ['X','Y']).values
def saving_adj_mat_method2(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(combinations_with_replacement(points,2)):
adj_mat[i]=haversine(point1, point2)
# for i in range(10):
# adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
# adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def saving_adj_mat_method1(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(permutations(points,2)):
adj_mat[i]=haversine(point1, point2)
for i in range(10):
adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def adjmat(self):
'''Function to calculate the adj Matrix/Cost matrix.'''
adj_mat1 = np.empty(10**2)
adj_mat2 = np.empty(20000**2)
adj_mat3 = np.empty(20000**2)
adj_mat4 = np.empty(20000**2)
adj_mat5 = np.empty(20000**2)
adj_mat6 = np.empty(20000**2)
adj_mat7 = np.empty(20000**2)
adj_mat8 = np.empty(20000**2)
adj_mat9 = np.empty(20000**2)
adj_mat10 = np.empty((len(self.cities)-(9*20000))**2)
#print(self.cities[:5])
print(f'Number of cities {len(self.cities)}')
# 1st 20k points
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
#2nd 20k points
adj_mat2 = self.saving_adj_mat(20000,40000)
#3rd 20k points
adj_mat3 = self.saving_adj_mat(40000,60000)
#4th 20k points
adj_mat4 = self.saving_adj_mat(60000,80000)
#5th 20k points
adj_mat5 = self.saving_adj_mat(80000,100000)
#6th 20k points
adj_mat6 = self.saving_adj_mat(100000,120000)
#7th 20k points
adj_mat7 = self.saving_adj_mat(120000,140000)
#8th 20k points
adj_mat8 = self.saving_adj_mat(140000,160000)
#9th 20k points
adj_mat9 = self.saving_adj_mat(160000,180000)
# #Last points
adj_mat10 = self.saving_adj_mat(180000,len(self.cities))
'''
#store the array
file = self.path + '/adjmat_method1.npz'
np.savez(file, adj_mat1)
file = self.path + '/adjmat_method2.npz'
np.savez(file, adj_mat1c)
def getnumpyarr(self):
'''loading the pre-calculated numpy matrices'''
datavals =
for f in glob(self.path + "/*.npz", recursive=True):
datavals.append(np.load(f))
return datavals
def getcompletemat(self, *args):
'''Get the matrix reshaped and combined to be used for the algorithm.'''
data1 = args[0]
arr1 = data1['arr_0'].reshape(10, 10)
# arr2 = data1['arr_1'].reshape(20000, 20000)
# arr3 = data1['arr_2'].reshape(20000, 20000)
# arr4 = data1['arr_3'].reshape(20000, 20000)
# arr5 = data1['arr_4'].reshape(20000, 20000)
# arr6 = data1['arr_5'].reshape(20000, 20000)
# arr7 = data1['arr_6'].reshape(20000, 20000)
# arr8 = data1['arr_7'].reshape(20000, 20000)
# arr9 = data2['arr_0'].reshape(20000, 20000)
# arr10 = data2['arr_1'].reshape((len(self.cities)-(9*20000)), (len(self.cities)-(9*20000)))
print(arr1)
if __name__ == "__main__":
wd = os.getcwd()
path = wd.split("/")
newpath = "/".join(path[:-1])
check_npz=False
for f in glob(newpath + "/*.npz", recursive=True):
check_npz=True
obj=Two_opt(newpath)
if not check_npz:obj.adjmat()
datalist = obj.getnumpyarr()
for i in range(len(datalist)):
print(f'data present n data1 - {datalist[i]} n')
print(f'keys of data1 {datalist[i].files}')
obj.getcompletemat(datalist[i])
@Graipher gave a solution on which the 1st function call is based on, Correct me if I am wrong - I could have put this in the comments but cannot as I do not have enough reputation here. With combinations_with_replacement
we will not get all combinations for a square-cost matrix? Please tell me if I am missing something here.
combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC,
It does not give us the points, (BA) and (CA, CB). So if we have 10
points, we will only get 10, 9, 8, 7, ...1 and so on unique
combinations to fill the cost matrix. That will not give the correct
cost/distance matrix
Instead, I used permutations(), to get the correct and symmetric matrix, but this might not be optimal time-complexity. I need some advice here.
Output for 1st Function call - This does not give the correct matrix, as expected, this first function uses combinations_with_replacement, this is faster but gives the wrong answer
[[ 0.00000000e+000 1.27332308e+004 6.57898227e+003 5.86284373e+003
3.87871486e+003 3.67382793e+003 9.38784305e+003 6.46237005e+003
6.51036438e+003 5.43284410e+003]
[ 0.00000000e+000 1.62585887e+004 8.60903414e+003 1.41825246e+004
1.16056316e+004 1.70335895e+004 6.30445511e+003 1.14165602e+004
1.44391026e+004 0.00000000e+000]
[ 8.21256846e+003 7.62939803e+003 9.52888602e+003 6.59919871e+003
1.16551154e+004 1.11473488e+004 8.20792339e+003 0.00000000e+000
9.70389701e+003 8.46124341e+003]
[ 1.43339885e+004 3.99278066e+003 1.13084634e+004 1.12586665e+004
0.00000000e+000 2.61777230e+003 6.51291219e+003 9.11701996e+003
3.57624326e+003 1.55720772e+003]
[ 0.00000000e+000 8.99206700e+003 6.76801760e+003 2.97848880e+003
3.57521856e+003 0.00000000e+000 1.56060528e+004 7.79162477e+003
5.44506378e+003 0.00000000e+000]
[ 8.75252369e+003 1.03348289e+004 0.00000000e+000 3.06388021e+003
0.00000000e+000 4.68834210e-310 1.08566335e+012 6.90532443e-310
4.68834210e-310 1.44020725e+001]
[ 6.90532443e-310 4.68834210e-310 1.78169586e-267 6.90532443e-310
6.90532400e-310 6.88233445e-321 4.68834280e-310 4.68834246e-310
0.00000000e+000 0.00000000e+000]
[ 2.02566915e-322 -7.59297323e+303 1.08823978e-269 -3.29118277e+304
-6.17074930e+303 4.23627391e-183 1.51575303e-207 1.11491862e-187
-1.02851904e+304 -2.02003913e-269]
[ 6.90533811e-310 6.90533890e-310 4.02018951e-197 6.90532598e-310
4.68834210e-310 -1.47320674e-196 6.90532598e-310 4.68834210e-310
7.05462116e+117 6.90532598e-310]
[ 6.90533816e-310 2.47424187e+257 6.90532720e-310 4.68834211e-310
-4.20571524e-040 6.90532554e-310 4.68834210e-310 -1.76018832e+290
6.90532554e-310 4.68834210e-310]]
Output for 2nd Function call - This gives the correct answer, it is way slower for a large number of points, so it not optimal and needs to be checked for better time complexity
[[ 0. 12733.23081175 6578.98226932 5862.84372969
3878.71485682 3673.82793216 9387.84304626 6462.37004961
6510.36438472 5432.84409789]
[12733.23081175 0. 16258.5887252 8609.03413858
14182.52456514 11605.63163523 17033.5895249 6304.4551083
11416.5602085 14439.1026007 ]
[ 6578.98226932 16258.5887252 0. 8212.56845897
7629.39802629 9528.88602263 6599.19870899 11655.11542038
11147.34880286 8207.9233913 ]
[ 5862.84372969 8609.03413858 8212.56845897 0.
9703.89700789 8461.24341497 14333.98852641 3992.7806584
11308.46341052 11258.66649267]
[ 3878.71485682 14182.52456514 7629.39802629 9703.89700789
0. 2617.77230465 6512.91218854 9117.01995685
3576.24326113 1557.20772372]
[ 3673.82793216 11605.63163523 9528.88602263 8461.24341497
2617.77230465 0. 8992.06700473 6768.01759819
2978.48879937 3575.21855563]
[ 9387.84304626 17033.5895249 6599.19870899 14333.98852641
6512.91218854 8992.06700473 0. 15606.05278166
7791.62477038 5445.06378019]
[ 6462.37004961 6304.4551083 11655.11542038 3992.7806584
9117.01995685 6768.01759819 15606.05278166 0.
8752.52368527 10334.82891513]
[ 6510.36438472 11416.5602085 11147.34880286 11308.46341052
3576.24326113 2978.48879937 7791.62477038 8752.52368527
0. 3063.88021254]
[ 5432.84409789 14439.1026007 8207.9233913 11258.66649267
1557.20772372 3575.21855563 5445.06378019 10334.82891513
3063.88021254 0. ]]
python-3.x
New contributor
Link to the code/idea that is used here - Extracting an adjaceny matrix containing haversine distance from points on map
Note: This is a question that is based on the link above, where the cost matrix is calculated using combinations_with_replacement
, which is not giving me the correct results.
What is the purpose of the code below :
Square Cost/Distance Matrix or Adjacent Matrix of points, each point constitutes of coordinates, The reason I am writing the code is to solve the TSP for nearly 200k points to get the optimal solution, I need to get the cost matrix and then apply the 2-opt or 3-opt algorithm on top of it
So let's say as below if we have 10 points our cost matrix will be a 10x10 matrix with 0's on the diagonals. I have to use this cost matrix later to solve a TSP for the coordinates given.
Function calls for both the functions and the Complete code is given below
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
Local TSP Solver- Implementing the 2-opt method to solve a TSP.
'''
import sys, os
import pandas as pd
from glob import glob
import numpy as np
from itertools import combinations_with_replacement, combinations, permutations
from haversine import haversine
import time
class Two_opt(object):
'''Using the 2-opt algoirthm to improve on the TSP solver'''
def __init__(self,arg):
'''Return the global variable cities as a numpy array.'''
self.path = arg
for f in glob(self.path+"/*.csv", recursive=True):
if 'cities' in f:
self.cities = pd.read_csv(f, usecols = ['X','Y']).values
def saving_adj_mat_method2(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(combinations_with_replacement(points,2)):
adj_mat[i]=haversine(point1, point2)
# for i in range(10):
# adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
# adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def saving_adj_mat_method1(self, ind1, ind2):
'''Utility funtion to Calculate the haversine distance for the cost matrix.'''
start=time.time()
adj_mat = np.empty(10**2)
points = self.cities[ind1:ind2]
for i, (point1, point2) in enumerate(permutations(points,2)):
adj_mat[i]=haversine(point1, point2)
for i in range(10):
adj_mat = np.concatenate((adj_mat[0:((10*i)+i)],[haversine(points[i], points[i])],adj_mat[((10*i)+i):]), axis=0)
adj_mat=adj_mat[:10**2]
adj_mat.reshape(10, 10)
end=time.time()
print(f'Time taken for {round(float(end-start),3)}')
return adj_mat
def adjmat(self):
'''Function to calculate the adj Matrix/Cost matrix.'''
adj_mat1 = np.empty(10**2)
adj_mat2 = np.empty(20000**2)
adj_mat3 = np.empty(20000**2)
adj_mat4 = np.empty(20000**2)
adj_mat5 = np.empty(20000**2)
adj_mat6 = np.empty(20000**2)
adj_mat7 = np.empty(20000**2)
adj_mat8 = np.empty(20000**2)
adj_mat9 = np.empty(20000**2)
adj_mat10 = np.empty((len(self.cities)-(9*20000))**2)
#print(self.cities[:5])
print(f'Number of cities {len(self.cities)}')
# 1st 20k points
adj_mat1 = self.saving_adj_mat_method1(0,10)
adj_mat1c = self.saving_adj_mat_method2(0,10)
'''
#2nd 20k points
adj_mat2 = self.saving_adj_mat(20000,40000)
#3rd 20k points
adj_mat3 = self.saving_adj_mat(40000,60000)
#4th 20k points
adj_mat4 = self.saving_adj_mat(60000,80000)
#5th 20k points
adj_mat5 = self.saving_adj_mat(80000,100000)
#6th 20k points
adj_mat6 = self.saving_adj_mat(100000,120000)
#7th 20k points
adj_mat7 = self.saving_adj_mat(120000,140000)
#8th 20k points
adj_mat8 = self.saving_adj_mat(140000,160000)
#9th 20k points
adj_mat9 = self.saving_adj_mat(160000,180000)
# #Last points
adj_mat10 = self.saving_adj_mat(180000,len(self.cities))
'''
#store the array
file = self.path + '/adjmat_method1.npz'
np.savez(file, adj_mat1)
file = self.path + '/adjmat_method2.npz'
np.savez(file, adj_mat1c)
def getnumpyarr(self):
'''loading the pre-calculated numpy matrices'''
datavals =
for f in glob(self.path + "/*.npz", recursive=True):
datavals.append(np.load(f))
return datavals
def getcompletemat(self, *args):
'''Get the matrix reshaped and combined to be used for the algorithm.'''
data1 = args[0]
arr1 = data1['arr_0'].reshape(10, 10)
# arr2 = data1['arr_1'].reshape(20000, 20000)
# arr3 = data1['arr_2'].reshape(20000, 20000)
# arr4 = data1['arr_3'].reshape(20000, 20000)
# arr5 = data1['arr_4'].reshape(20000, 20000)
# arr6 = data1['arr_5'].reshape(20000, 20000)
# arr7 = data1['arr_6'].reshape(20000, 20000)
# arr8 = data1['arr_7'].reshape(20000, 20000)
# arr9 = data2['arr_0'].reshape(20000, 20000)
# arr10 = data2['arr_1'].reshape((len(self.cities)-(9*20000)), (len(self.cities)-(9*20000)))
print(arr1)
if __name__ == "__main__":
wd = os.getcwd()
path = wd.split("/")
newpath = "/".join(path[:-1])
check_npz=False
for f in glob(newpath + "/*.npz", recursive=True):
check_npz=True
obj=Two_opt(newpath)
if not check_npz:obj.adjmat()
datalist = obj.getnumpyarr()
for i in range(len(datalist)):
print(f'data present n data1 - {datalist[i]} n')
print(f'keys of data1 {datalist[i].files}')
obj.getcompletemat(datalist[i])
@Graipher gave a solution on which the 1st function call is based on, Correct me if I am wrong - I could have put this in the comments but cannot as I do not have enough reputation here. With combinations_with_replacement
we will not get all combinations for a square-cost matrix? Please tell me if I am missing something here.
combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC,
It does not give us the points, (BA) and (CA, CB). So if we have 10
points, we will only get 10, 9, 8, 7, ...1 and so on unique
combinations to fill the cost matrix. That will not give the correct
cost/distance matrix
Instead, I used permutations(), to get the correct and symmetric matrix, but this might not be optimal time-complexity. I need some advice here.
Output for 1st Function call - This does not give the correct matrix, as expected, this first function uses combinations_with_replacement, this is faster but gives the wrong answer
[[ 0.00000000e+000 1.27332308e+004 6.57898227e+003 5.86284373e+003
3.87871486e+003 3.67382793e+003 9.38784305e+003 6.46237005e+003
6.51036438e+003 5.43284410e+003]
[ 0.00000000e+000 1.62585887e+004 8.60903414e+003 1.41825246e+004
1.16056316e+004 1.70335895e+004 6.30445511e+003 1.14165602e+004
1.44391026e+004 0.00000000e+000]
[ 8.21256846e+003 7.62939803e+003 9.52888602e+003 6.59919871e+003
1.16551154e+004 1.11473488e+004 8.20792339e+003 0.00000000e+000
9.70389701e+003 8.46124341e+003]
[ 1.43339885e+004 3.99278066e+003 1.13084634e+004 1.12586665e+004
0.00000000e+000 2.61777230e+003 6.51291219e+003 9.11701996e+003
3.57624326e+003 1.55720772e+003]
[ 0.00000000e+000 8.99206700e+003 6.76801760e+003 2.97848880e+003
3.57521856e+003 0.00000000e+000 1.56060528e+004 7.79162477e+003
5.44506378e+003 0.00000000e+000]
[ 8.75252369e+003 1.03348289e+004 0.00000000e+000 3.06388021e+003
0.00000000e+000 4.68834210e-310 1.08566335e+012 6.90532443e-310
4.68834210e-310 1.44020725e+001]
[ 6.90532443e-310 4.68834210e-310 1.78169586e-267 6.90532443e-310
6.90532400e-310 6.88233445e-321 4.68834280e-310 4.68834246e-310
0.00000000e+000 0.00000000e+000]
[ 2.02566915e-322 -7.59297323e+303 1.08823978e-269 -3.29118277e+304
-6.17074930e+303 4.23627391e-183 1.51575303e-207 1.11491862e-187
-1.02851904e+304 -2.02003913e-269]
[ 6.90533811e-310 6.90533890e-310 4.02018951e-197 6.90532598e-310
4.68834210e-310 -1.47320674e-196 6.90532598e-310 4.68834210e-310
7.05462116e+117 6.90532598e-310]
[ 6.90533816e-310 2.47424187e+257 6.90532720e-310 4.68834211e-310
-4.20571524e-040 6.90532554e-310 4.68834210e-310 -1.76018832e+290
6.90532554e-310 4.68834210e-310]]
Output for 2nd Function call - This gives the correct answer, it is way slower for a large number of points, so it not optimal and needs to be checked for better time complexity
[[ 0. 12733.23081175 6578.98226932 5862.84372969
3878.71485682 3673.82793216 9387.84304626 6462.37004961
6510.36438472 5432.84409789]
[12733.23081175 0. 16258.5887252 8609.03413858
14182.52456514 11605.63163523 17033.5895249 6304.4551083
11416.5602085 14439.1026007 ]
[ 6578.98226932 16258.5887252 0. 8212.56845897
7629.39802629 9528.88602263 6599.19870899 11655.11542038
11147.34880286 8207.9233913 ]
[ 5862.84372969 8609.03413858 8212.56845897 0.
9703.89700789 8461.24341497 14333.98852641 3992.7806584
11308.46341052 11258.66649267]
[ 3878.71485682 14182.52456514 7629.39802629 9703.89700789
0. 2617.77230465 6512.91218854 9117.01995685
3576.24326113 1557.20772372]
[ 3673.82793216 11605.63163523 9528.88602263 8461.24341497
2617.77230465 0. 8992.06700473 6768.01759819
2978.48879937 3575.21855563]
[ 9387.84304626 17033.5895249 6599.19870899 14333.98852641
6512.91218854 8992.06700473 0. 15606.05278166
7791.62477038 5445.06378019]
[ 6462.37004961 6304.4551083 11655.11542038 3992.7806584
9117.01995685 6768.01759819 15606.05278166 0.
8752.52368527 10334.82891513]
[ 6510.36438472 11416.5602085 11147.34880286 11308.46341052
3576.24326113 2978.48879937 7791.62477038 8752.52368527
0. 3063.88021254]
[ 5432.84409789 14439.1026007 8207.9233913 11258.66649267
1557.20772372 3575.21855563 5445.06378019 10334.82891513
3063.88021254 0. ]]
python-3.x
python-3.x
New contributor
New contributor
edited 58 mins ago
New contributor
asked yesterday
Sandeep Anand
12
12
New contributor
New contributor
put on hold as unclear what you're asking by Zeta, alecxe, Jamal♦ 20 hours ago
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
put on hold as unclear what you're asking by Zeta, alecxe, Jamal♦ 20 hours ago
Please clarify your specific problem or add additional details to highlight exactly what you need. As it's currently written, it’s hard to tell exactly what you're asking. See the How to Ask page for help clarifying this question. If this question can be reworded to fit the rules in the help center, please edit the question.
2
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
2
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday
add a comment |
2
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
2
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday
2
2
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
2
2
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday
add a comment |
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
active
oldest
votes
2
Welcome to Code Review. Unfortunately, it's not clear what your code does, and whether it actually works. Please edit your code to add the task you've implemented. Also, the current question title, which states your concerns about the code, doesn't show confidence that the code actually works. The site standard is for the title to simply state the task accomplished by the code. Please see How to Ask for examples, and revise the title accordingly.
– Zeta
yesterday
2
You can comment on your own posts even if you don't have the reputation to comment elsewhere.
– bruglesco
yesterday