Filter_bubble

Various personalization algorithms are applied in e-commerce and other kinds of websites/apps in order to increase purchases or user engagement.

The following time series were collected over a period of 20 months from a large online retail store. In this study, during the first 10 months (normalized as the period from month -10 to month 0), a contextual personalization algorithm was applied in order to increase user engagement (measured as the accumulated number of ‘Likes’ on products shared via Facebook). As is the case with many contextual approaches, this algorithm suffered from the ‘filter bubble’ problem (https://en.wikipedia.org/wiki/Filter_bubble), …


Introduction to Genetic Algorithm (Unsupervised Learning): generate a Rose

A mon amie la Rose!

The goal is to start from a random image and generate a picture close to the model using AI
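
As a rough illustration of the idea, here is a minimal sketch of a genetic algorithm that evolves a random list of grayscale pixel values toward a target. It is not the article's implementation: the function names (fitness, mutate, crossover, evolve), the flat-list image representation, and every parameter value are assumptions chosen for the example.

```python
import random

def fitness(candidate, target):
    # Negative sum of absolute pixel differences: 0 means a perfect match.
    return -sum(abs(c - t) for c, t in zip(candidate, target))

def mutate(candidate, rate, rng):
    # Replace each pixel by a random value with probability `rate`.
    return [rng.randrange(256) if rng.random() < rate else p for p in candidate]

def crossover(a, b, rng):
    # Single-point crossover: head of one parent, tail of the other.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(target, pop_size=50, generations=200, rate=0.02, seed=0):
    # Hypothetical driver: all defaults are illustrative, not tuned.
    rng = random.Random(seed)
    population = [[rng.randrange(256) for _ in target] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        parents = population[: pop_size // 5]  # keep the best 20% unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=lambda c: fitness(c, target))

# Usage: a tiny 8x8 "image" flattened to 64 pixel values.
target = [(i * 4) % 256 for i in range(64)]
best = evolve(target)
```

Because the best candidates are carried over unchanged each generation, the best fitness in the population can never decrease; how close it gets to the target depends on the population size, mutation rate, and number of generations.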


Random, Epsilon-greedy, UCB bandit

The multi-armed bandit is the problem of choosing between alternative options with unknown rewards, trying to maximize your expected reward while learning those rewards at the same time. A wiki article.

The entire notebook is available here

We’ll create a simple environment for simulating such problems and try a couple of strategies dealing with them.

Base

  • Set up library
  • Implement a class to generate rewards for each action taken by the solver
  • Implement a base class for different problem solvers
  • Write a function that puts our solver into the environment

Random

Let’s try a simple solver…
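
The steps listed under Base, together with a uniformly random solver, could be sketched as follows. This is a minimal sketch under assumptions, not the notebook's code: the class and function names (BernoulliBandit, Solver, RandomSolver, run) and the choice of Bernoulli rewards are hypothetical.

```python
import random

class BernoulliBandit:
    """Environment: each arm pays 1 with its own hidden probability, else 0."""

    def __init__(self, probs, seed=0):
        self.probs = list(probs)
        self.rng = random.Random(seed)

    def pull(self, arm):
        return 1 if self.rng.random() < self.probs[arm] else 0

class Solver:
    """Base class: tracks pull counts and the running mean reward of each arm."""

    def __init__(self, n_arms, seed=0):
        self.n_arms = n_arms
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms
        self.rng = random.Random(seed)

    def choose(self):
        raise NotImplementedError

    def update(self, arm, reward):
        # Incremental mean: no need to store the full reward history.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

class RandomSolver(Solver):
    """Picks an arm uniformly at random and ignores what it has learned."""

    def choose(self):
        return self.rng.randrange(self.n_arms)

def run(env, solver, steps):
    # Put the solver into the environment and return the total reward.
    total = 0
    for _ in range(steps):
        arm = solver.choose()
        reward = env.pull(arm)
        solver.update(arm, reward)
        total += reward
    return total

env = BernoulliBandit([0.2, 0.8], seed=1)
solver = RandomSolver(n_arms=2, seed=2)
total_reward = run(env, solver, steps=1000)
```

Keeping the estimates in the base class means other strategies only need to override choose, which is the design assumed here.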


Let’s make it comfortable

photo by Anders Jildén on unsplash

A demo is available here. Below is the process to emulate a terminal in your Google Colab.

Part 1

Just copy and paste this code into a cell of your Colab and run it:

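from IPython.display import JSON
from google.colab import output
from subprocess import getoutput
import os

def shell(command):
    parts = command.strip().split(maxsplit=1)
    # Handle 'cd' here: a subprocess cannot change the notebook's working directory
    if parts and parts[0] == 'cd':
        # A bare 'cd' goes to the home directory, as in a regular shell
        os.chdir(os.path.expanduser(parts[1] if len(parts) > 1 else '~'))
        return JSON([''])
    # Any other command runs in a subprocess; its output is sent back as JSON
    return JSON([getoutput(command)])

# Expose shell() to the JavaScript side under the name 'shell'
output.register_callback('shell', shell)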

Part 2

In another cell, copy, paste, and run the following:

#@title Colab Shell
%%html
<div id=term_demo></div>
<script src="https://code.jquery.com/jquery-latest.js"></script>
<script src="https://cdn.jsdelivr.net/npm/jquery.terminal/js/jquery.terminal.min.js"></script>
<link href="https://cdn.jsdelivr.net/npm/jquery.terminal/css/jquery.terminal.min.css" rel="stylesheet"/>
<script>
$('#term_demo').terminal(async function(command) {
if (command !== '') {…

Classical interview question explained

Photo by Glen Carrie on unsplash

First, let’s agree that if the strings don’t have the same size, they are different.

Solution 1

We will use Python’s built-in sorted. If the only difference between the strings is a permutation, sorting them should make them equal.

def permutation1(str1, str2):
    if len(str1) != len(str2):
        return False
    return "".join(sorted(str1)) == "".join(sorted(str2))

sorted: Python uses Timsort, which runs in O(n) in the best case and O(n log n) in the average and worst cases.

join: strings are immutable in Python, so the entire string needs to be copied. The complexity is O(n), where n is the size of the output string.
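
Since sorted already returns lists, which can be compared directly, the O(n) join step could be skipped. A minimal sketch of that variant follows; the name permutation_sorted is hypothetical and this is not one of the article's numbered solutions.

```python
def permutation_sorted(str1, str2):
    # Strings of different lengths can never be permutations of each other.
    if len(str1) != len(str2):
        return False
    # Compare the sorted character lists directly, without rebuilding strings.
    return sorted(str1) == sorted(str2)

is_perm = permutation_sorted("listen", "silent")
not_perm = permutation_sorted("abc", "abd")
```

The overall cost is still dominated by the two sorts, so the complexity class does not change; only the extra string copies go away.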

Solution 2

We are assuming the string…


What if you cannot use additional data structures?

photo by Fabian Grohs on unsplash

This is the first question of the famous book Cracking the Coding Interview.
The book offers only a Java solution.
Here is my implementation in Python, assuming the string is in ASCII (so the alphabet has 128 characters).

Solution 1 — First using a Set

We evaluate each character of the string. If we have already visited the character, we stop; otherwise we add it to a set and continue.

def unique_with_set(str_input):
    if len(str_input) > 128:
        return False
    data = set()
    for c in str_input:
        if c in data:
            return False
        else:
            data.add(c)
    return True

complexity…
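
For the question in the title, one common way to avoid a set is to use a single integer as a bit vector. The sketch below is an assumption about how that could look, not the article's own follow-up solution; the name unique_with_bitmask is hypothetical, and it keeps the same 128-character ASCII assumption.

```python
def unique_with_bitmask(str_input):
    # With a 128-character alphabet, a longer string must repeat a character.
    if len(str_input) > 128:
        return False
    seen = 0  # bit i is set once the character with code point i was seen
    for c in str_input:
        bit = 1 << ord(c)
        if seen & bit:
            return False
        seen |= bit
    return True

all_unique = unique_with_bitmask("abcdef")
has_repeat = unique_with_bitmask("abcdea")
```

Whether an integer counts as an "additional data structure" is debatable; it still uses memory proportional to the alphabet size, just packed into bits.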


Tutorial to predict clicks

This post is an introduction to recommender systems.
The next step will be a Kaggle submission via Jupyter.

Tech stack on this post:

  1. Jupyter
  2. Vowpal Wabbit
  3. Spark
  4. Parquet

Vowpal Wabbit is an open-source, fast, online, interactive machine learning system library and program developed originally at Yahoo! Research, and currently at Microsoft Research.

We will use data from the Kaggle competition outbrain-click-prediction. It’s an old competition but a good match for our needs.

Let’s import the classical ML libraries:

import tqdm.notebook as tqdm
import numpy as np
import scipy
import sklearn
import matplotlib.pyplot as plt

Let’s also import Spark

Samuel Guedj

Data Scientist. DevOps.
