Posts

Showing posts from July, 2018

MeanNearestNeighbors (MNN) - algorithm for balancing dataset - In progress #1

One of the challenges in classification problems is unbalanced datasets. I was a Data Science Intern when the company I worked for assigned me an interesting challenge in which the dataset was unbalanced. I soon realized that unbalanced datasets are a common thing in real life. I tried most of the undersampling and oversampling algorithms, such as SMOTE, NearMiss, CondensedNearestNeighbors, RandomUnderSampler, RandomOverSampler, KMeansSMOTE and the rest of them. They didn't help me in that case; on the contrary, they worsened my model. I was like: "but, but, you should have been helpful in creating the predictive model." So I'm trying to create another algorithm, based on the undersampling concept, for balancing datasets. I called it Mean Nearest Neighbors (MNN). What's the initial idea? It's simple. Actually, the algorithm is just a modification of the other undersampling algorithms. In the data where target labe...
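For readers new to the undersampling concept mentioned above, here is a minimal sketch of plain random undersampling in pure Python. It is not MNN and not the API of any library named in the post; the function name `random_undersample` and the toy data are assumptions made up for illustration.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # Every class is cut down to the size of the smallest one.
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: 8 samples of class 0 and 2 samples of class 1.
rows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(labels))           # Counter({0: 8, 1: 2})
print(Counter(balanced_labels))  # Counter({0: 2, 1: 2})
```

Random undersampling throws away majority-class samples blindly, which is one reason it can worsen a model; neighbor-based variants try to choose which samples to keep more carefully.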

Top 10 technologies to learn in 2018


5 Principles of Programming that Every Coder Must Know

Below are some useful principles of programming to keep in mind while writing code.

1. Simplicity

Simplicity is the ultimate sophistication, and perhaps nowhere more than in programming. It all begins with how you document and dissect program requirements. Each requirement should be articulated well enough that, once you start to code, you can satisfy it using the simplest of techniques. Complex code not only takes more time to design and write but is also more vulnerable to errors and bugs. A labyrinth of code can make web app monitoring tedious. Beware of feature creep, where you start adding features the customer didn’t ask for, as this only needlessly entangles the software.

2. Do Not Repeat Yourself

Minimal repetition is a sign of quality code. Avoid duplicating logic and data. To know whether your program has excessive repetition, think about how much code you’d need to modify if you wanted to alte...
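The "Do Not Repeat Yourself" idea can be illustrated with a small sketch. Everything here (the function names, the 20% tax rate, prices in cents) is a made-up example, not something taken from the post.

```python
TAX_PERCENT = 20  # hypothetical tax rate, used only for illustration


# Repetitive version: the same pricing rule is written out twice,
# so a change to the tax rule has to be made in two places.
def book_total_repeated(price_cents, quantity):
    subtotal = price_cents * quantity
    return subtotal + subtotal * 20 // 100


def laptop_total_repeated(price_cents, quantity):
    subtotal = price_cents * quantity
    return subtotal + subtotal * 20 // 100


# DRY version: the rule lives in one function that both callers reuse.
def total_with_tax(price_cents, quantity):
    subtotal = price_cents * quantity
    return subtotal + subtotal * TAX_PERCENT // 100


def book_total(price_cents, quantity):
    return total_with_tax(price_cents, quantity)


def laptop_total(price_cents, quantity):
    return total_with_tax(price_cents, quantity)


print(book_total(1000, 3))  # 3600
```

With the shared helper, changing the tax rate means editing a single constant rather than hunting down every copy of the formula.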