Posts

Showing posts from February, 2018

MeanNearestNeighbors (MNN) - algorithm for balancing dataset - In progress #1

Unbalanced datasets are one of the challenges in classification problems. I was a Data Science Intern when the company I worked for assigned me an interesting problem with an unbalanced dataset, and I soon realized that unbalanced data is a common thing in real life. I tried most of the undersampling and oversampling algorithms, such as SMOTE, NearMiss, CondensedNearestNeighbors, RandomUnderSampler, RandomOverSampler, KMeansSMOTE and the rest of them. They didn't help in that case; on the contrary, they worsened my model. I was like: "but, but, you should have been helpful in creating the predictive model". So I'm trying to create another algorithm for balancing datasets, based on the undersampling concept. I called it Mean Nearest Neighbors (MNN). What's the initial idea? It's simple: the algorithm is just a modification of the other undersampling algorithms. In the data where target labe...

Competitive Programming #25: [Check if a number is divisible by 8]

Given a number n, check if it is divisible by 8.
Input: The first line of the input contains an integer T denoting the number of test cases. Each test case consists of an integer n whose divisibility we need to check.
Output: For each test case, print 1 if the number is divisible by 8, else -1.
Constraints: 1<=T<=100 1<=digits in n
Example: Input: 2 16 15 Output: 1 -1
--------------------------------------------------------------
Solution: It's easy: use the divisibility rule for 8. If the number has 3 or more digits, the number formed by its last three digits must be divisible by 8 ... For more divisibility rules: https://en.wikipedia.org/wiki/Divisibility_rule
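A minimal sketch of the last-three-digits rule, assuming n arrives as a string of decimal digits (the constraint is stated in digits, so n may not fit a fixed-width integer). The function name `is_divisible_by_8` is made up for this illustration and is not from the original post.

```python
def is_divisible_by_8(n: str) -> int:
    """Return 1 if the decimal string n is divisible by 8, else -1."""
    digits = n.strip()
    # 1000 is a multiple of 8, so only the last three digits decide the answer.
    # Slicing also covers numbers with fewer than three digits.
    tail = int(digits[-3:])
    return 1 if tail % 8 == 0 else -1


if __name__ == "__main__":
    for sample in ["16", "15"]:
        print(is_divisible_by_8(sample))
```

Run on the example input above (16 and 15), this prints 1 and then -1.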

Competitive Programming #24: Microsoft Problem -> [Excel Sheet | Part - 1]

Given a positive integer, return its corresponding column title as it appears in an Excel sheet. For example:
    1 -> A
    2 -> B
    3 -> C
    ...
    26 -> Z
    27 -> AA
    28 -> AB
NOTE: The letters are all uppercase.
Input: The first line contains an integer T, the total number of test cases. Each of the following T lines contains an integer N.
Output: Print the string corresponding to the column number.
Constraints: 1 ≤ T ≤ 100 1 ≤ N ≤ 10000000
Example: Input 1 51 Output AY
----------------------------------------------------------------
Solution:
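The excerpt ends before the solution itself, so the following is only one possible approach and not necessarily the author's. It treats the title as a base-26 number with no zero digit (A = 1 ... Z = 26), which is why 1 is subtracted before each division. The function name `column_title` is an assumption made for this sketch.

```python
def column_title(n: int) -> str:
    """Convert a positive column number to its Excel-style title."""
    letters = []
    while n > 0:
        # Shift to 0-based so that 26 maps to 'Z' instead of rolling over.
        n, remainder = divmod(n - 1, 26)
        letters.append(chr(ord("A") + remainder))
    # Letters were produced from least significant to most significant.
    return "".join(reversed(letters))


if __name__ == "__main__":
    print(column_title(51))
```

For the example input 51 this prints AY, matching the expected output in the problem statement.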

Competitive Programming #23: [Maximum Sub Array]

Find the maximum sub-array of non-negative numbers in an array. The sub-array must be contiguous; that is, a sub-array created by choosing the second and fourth elements and skipping the third is invalid. The maximum sub-array is defined in terms of the sum of its elements: sub-array A is greater than sub-array B if sum(A) > sum(B).
Example: A : [1, 2, 5, -7, 2, 3] The two sub-arrays are [1, 2, 5] and [2, 3]. The answer is [1, 2, 5], as its sum is larger than that of [2, 3].
NOTE 1: If there is a tie, compare the segments' lengths and return the longer segment.
NOTE 2: If there is still a tie, return the segment with the minimum starting index.
Input: The first line contains an integer T, the total number of test cases. Each test case consists of a line with an integer N, the size of the array, followed by a line with the values of the array.
Output: Print the sub-array with the maximum sum.
Constraints: 1...
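The rules above, including both tie-breaks, can be sketched as a single left-to-right scan. This is an illustration under assumptions, not the post's own solution: the function name `max_nonneg_subarray` is made up here, and returning an empty list when every element is negative is a choice the excerpt does not specify.

```python
def max_nonneg_subarray(values):
    """Return the contiguous run of non-negative numbers with the largest sum."""
    best_start, best_len, best_sum = 0, 0, -1
    start, current_sum = 0, 0
    # A trailing negative sentinel closes the final run inside the loop.
    for i, x in enumerate(list(values) + [-1]):
        if x >= 0:
            current_sum += x
            continue
        length = i - start
        # Larger sum wins; on equal sums the longer run wins (NOTE 1).
        # Strict comparisons keep the earliest run on a full tie (NOTE 2).
        if length > 0 and (
            current_sum > best_sum
            or (current_sum == best_sum and length > best_len)
        ):
            best_start, best_len, best_sum = start, length, current_sum
        start, current_sum = i + 1, 0
    return list(values[best_start:best_start + best_len])


if __name__ == "__main__":
    print(max_nonneg_subarray([1, 2, 5, -7, 2, 3]))
```

For the example array this prints [1, 2, 5], the run with sum 8.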