airbnb.io - Aerosolve: A machine learning package built for humans

Airbnb Engineering & Data Science

By Hector Yee

A machine learning library designed from the ground up to be human friendly. It differs from other machine learning libraries in the following ways:

- A thrift-based feature representation that enables pairwise ranking loss and a single-context, multiple-item representation.
- A feature transform language that gives the user a lot of control over the features.
- Human-friendly, debuggable models.
- Separate lightweight Java inference code.
- Scala code for training.
- Simple image content analysis code suitable for ordering or ranking images.

This library is meant to be used with sparse, interpretable features such as those that commonly occur in search (search keywords, filters) or pricing (number of rooms, location, price). It is less interpretable for problems with very dense features that are not human interpretable, such as raw pixels or audio samples.

There are a few reasons to focus on interpretability:

- Your corpus is new and not fully specified, and you want more insight into your corpus.
- Having interpretable models lets you iterate quickly. Figure out where the model disagrees most and have insight into what kind of new features are needed.
- Debugging noisy features. By plotting the feature weights you can discover buggy features, or fit them to splines and discover features that are unexpectedly complex (which usually indicates overfitting).
- You can discover relationships between different variables and your target prediction. e.g. for the Airbnb demand model, plotting graphs of reviews and 3-star reviews is more interpretable than many nested if-then-else rules.

How to get started?

The artifacts for aerosolve are hosted on bintray. If you use Maven, SBT or Gradle you can just point to bintray as a repository and automatically fetch the artifacts.
Check out the image impression demo, where you can learn how to teach the algorithm to paint in the pointillism style of painting: Image Impressionism Demo. There is also an income prediction demo based on a popular machine learning benchmark: Income Prediction Demo.

Feature Representation

This section dives into the thrift-based feature representation.

Features are grouped into logical groups called families of features. The reason for this is so we can express transformations on an entire feature family at once, or interact two different families of features together to create a new feature family.

There are three kinds of features per FeatureVector:

- stringFeatures: a map of feature family to binary feature strings. For example, "GEO" -> { "San Francisco", "CA", "USA" }
- floatFeatures: a map of feature family to feature name and value. For example, "LOC" -> { "Latitude" : 37.75, "Longitude" : -122.43 }
- denseFeatures: a map of feature family to a dense array of floats. Not really used except for the image content analysis code.

Example Representation

Examples are the basic unit of creating training data and scoring. A single example is composed of:

- context: a FeatureVector that occurs once in the example. It could be the features representing a search session, e.g. "Keyword" -> "Free parking"
- example(0..N): a repeated list of FeatureVectors that represent the items being scored. These can correspond to documents in a search session, e.g. "LISTING CITY" -> "San Francisco"

The reasons for having this structure are:

- Having one context for hundreds of items saves a lot of space during RPCs or even on disk.
- You can compute the transforms for the context once, then apply the transformed context repeatedly in conjunction with each item.
- Having a list of items allows the use of list-based loss functions such as pairwise ranking loss, domination loss, etc., where we evaluate multiple items at once.

Feature Transform language

This section dives into the feature transform language.

Feature transforms are applied with a separate transformer module that is decoupled from the model. This allows the user to break apart transforms, or to transform data ahead of scoring time. For example, in an application the items in a corpus may be transformed ahead of time and stored, while the context is not known until runtime. Then at runtime, one can transform the context and combine it with each transformed item to get the final feature vector that is then fed to the models.

Feature transforms allow us to modify FeatureVectors on the fly. This allows engineers to iterate rapidly on feature engineering in a controlled way.

Here are some examples of feature transforms that are commonly used:

- List transform. A meta transform that specifies other transforms to be applied.
- Cross transform. Operates only on stringFeatures. Allows interactions between two different string feature families. e.g. "Keyword" cross "LISTING CITY" creates the new feature family "Keywordxcity" -> "Free parking^San Francisco"
- Multiscale grid transform. Constructs multiple nested grids for 2D coordinates. Useful for modelling geography.

Please see the respective unit tests as to what these transforms do, what kind of features they operate on and what kind of config they expect.
Models

This section covers debuggable models.

Although there are several models in the model directory, only two are the main debuggable models. The rest are experimental or sub-models that create transforms for the interpretable models.

- Linear model. Supports hinge, logistic, epsilon-insensitive regression, and ranking loss functions. Only operates on stringFeatures. The label for the task is stored in a special feature family and specified by rank_key in the config. See the linear model unit tests on how to set up the models. Note that in conjunction with quantization and crosses you can get incredible amounts of complexity from the "linear" model, so it is not really your regular linear model but something more complex, and it can be thought of as a bushy, very wide decision tree with millions of branches.
- Spline model. A general additive linear piecewise spline model. The training is done at a higher resolution specified by num_buckets between the min and max of a feature's range. At the end of each iteration we attempt to project the linear piecewise spline into a lower dimensional function such as a polynomial spline with Dirac delta endpoints. If the RMSE of the projection is above threshold, we leave the spline alone in the higher resolution piecewise linear mode. This allows us to debug the spline model for features that are buggy or unexpectedly complex (e.g. jumping up and down when we expect some kind of smoothness).
- Boosted stumps model. Small, compact model. Not very interpretable, but at small sizes useful for feature selection.
- Decision tree model. In memory only. Mostly used to generate transforms for the linear or spline model.
- Maxout neural network model. Experimental and mostly used as a comparison baseline.

Support

User group: https://groups.google.com/forum/#!forum/aerosolve-users

© Airbnb, Inc.
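As an illustration of the linear model described in the Models section, the following Python sketch scores string features by summing one weight per (family, feature) pair and evaluates a hinge loss and a pairwise ranking loss. It is a sketch under stated assumptions, not the library's implementation: the function names, the weight layout, the margin of 1 and the example weights are all hypothetical, and the source only names the loss families without giving their exact form.

```python
def score(weights, string_features):
    """Sum the weights of every (family, feature) pair present in the vector."""
    return sum(
        weights.get((family, feature), 0.0)
        for family, features in string_features.items()
        for feature in features
    )

def hinge_loss(label, prediction):
    """Hinge loss for a label in {-1, +1}; the margin of 1 is an assumption."""
    return max(0.0, 1.0 - label * prediction)

def pairwise_ranking_loss(weights, better, worse):
    """Hinge loss on the score difference: the better item should win by a margin of 1."""
    return max(0.0, 1.0 - (score(weights, better) - score(weights, worse)))

# Hypothetical weights, one per (feature family, feature string) pair.
weights = {
    ("Keywordxcity", "Free parking^San Francisco"): 0.75,
    ("LISTING CITY", "San Francisco"): 0.5,
    ("LISTING CITY", "Oakland"): 0.25,
}

# Two items from the same search session, already transformed.
booked = {
    "LISTING CITY": {"San Francisco"},
    "Keywordxcity": {"Free parking^San Francisco"},
}
skipped = {"LISTING CITY": {"Oakland"}}

print(score(weights, booked))                           # 1.25
print(hinge_loss(+1, score(weights, booked)))           # 0.0
print(pairwise_ranking_loss(weights, booked, skipped))  # 0.0
```

Because every weight is attached to a readable (family, feature) pair, the weights can be listed or plotted directly, which is the kind of inspection the interpretability discussion relies on.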