MeanNearestNeighbors (MNN) - algorithm for balancing dataset - In progress #1

Image
One of the challenges in classification problems are unbalanced datasets. I was Data Science Intern when the company that I worked for, assigned me such an interesting challenge where the dataset was unbalanced.  However, I realized this type of problem like unbalanced dataset is а common thing in real life. I tried most of the algorithms (undersampling, oversampling) like SMOTE, NearMiss, CondensedNearestNeighbors, RandomUnderSampler, RandomOverSampler,  KMeansSMOTЕ and rest of them. Anyway, they didn't help me in that case, on the contrary, they worsened my model.  I was like: "but, but, you should have been helpful in creating the predictive model" So, I'm trying to create another algorithm based on undersampling concept when it comes to balancing datasets. I called it Mean Nearest Neighbors (MNN). What's the initial idea: It's simple. Actually, the algorithm is just a modification of the other undersampling algorithms. In the data where target labe...

Competitive Programming #4: Microsoft -> [Nth item through sum]

This task is from Microsoft Interview:

Given two sorted arrays, we can get a set of sums(add one element from the first and one from second). Find the N’th element in the set considered in sorted order.
Note: Set of sums should have unique elements.



Input:
The first line of input contains an integer T denoting tghe number of test cases. Then T test cases follow. Each test case contains two integers m and n denoting the size of the two arrays respectively. The next two lines contains m and n space separated elements forming both the arrays respectively. The last line contains the value of N.

Output:
Print the value of Nth element in set. If Nth element doesn't exist in set print -1.

Constraints:
1<=T<=10^5
1<=m,n<=10^5
1<=arr1[i],arr2[j]<=10^5
1<=N<=m*n

Example:
Input:

2
2 2
1 2
3 4
3
5 4
1 3 4 8 10
20 22 30 40
4

Output:
6
25


Solution:
This would be pretty easy task if m,n <=10^3 ... but it is 10^5  . So we don't use algorithm with time complexity O(n^2). We gonna use here binary search ... It easier cause we have already sorted arrays and it will be fast. We use library 'set' from (STL) ,Cause in 'set'  hasn't repeating sums . It's only one and unique sum.
Firstly , we must find the smallest sum and the biggest sum .
The minimal sum is = n1[0]+m1[0];
The maximum sum is = n1[n-1] + m1[m-1]
These last variable (maximum sum) is very important variable for condition:
"if(sum > m1[m-1]+n1[n-1])