Scrapy is a widely used Python web scraping framework for building crawlers. As its name indicates, it was originally designed for scraping, but it is now used for many purposes, including data mining and automated testing. Scrapy is open source and a must-have library.
PyTorch is an open-source library. It is essentially a replacement for NumPy that can also run on GPUs, and it comes with higher-level functionality for building deep neural networks. You can still use other Python packages such as SciPy, Cython, and NumPy to extend PyTorch when required. Many organizations, including Facebook, Twitter, Nvidia, and Uber, use PyTorch for rapid prototyping in research and for training deep learning models.
Pendulum is a Python package for working with dates and times, and it makes that work a lot easier. Its DateTime class is designed as a drop-in replacement for the standard datetime class, so your code should still work if you replace every datetime instance with a Pendulum one. With Pendulum, you can parse dates and times and display them with their time zone. Pendulum can be seen as an improved alternative to the Arrow library, and it has all the handy methods you would expect for rounding, truncating, converting, parsing, formatting, and arithmetic.
Requests is one of the best-known Python libraries. It is written in Python and licensed under Apache2. The library is built to make HTTP easy for humans to work with. With Requests, you don’t need to add query strings to your URLs manually or form-encode your POST data. You can send HTTP requests to a server and attach headers, form data, multipart files, and so on.
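To make concrete what Requests takes off your hands, here is a rough sketch of the manual encoding step, written with only the standard library's `urllib.parse.urlencode`. The URL and field names are made up for illustration. With Requests, the same dictionaries would instead be passed as `params=` to `requests.get` or as `data=` to `requests.post`, and the library would do the encoding for you.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and parameters, used only for illustration.
base_url = "https://example.com/search"
params = {"q": "python libraries", "page": 2}

# Manual step 1: build the query string and glue it onto the URL.
query_string = urlencode(params)
full_url = f"{base_url}?{query_string}"

# Manual step 2: form-encode the body of a POST request.
form_body = urlencode({"user": "alice", "comment": "hello & goodbye"})

print(full_url)
print(form_body)
```

Spaces and the `&` inside the values have to be escaped correctly, which is exactly the kind of detail that is easy to get wrong by hand.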
PyFlux is a Python library for time series analysis and prediction. It was developed by Ross Taylor, offers many options for inference, and contains many new classes of model types. PyFlux lets users implement many modern time series models, such as GARCH, and forecast how a series will behave in the future.
Zappa is one of the best Python packages. Created by Miserlou, it makes it easy to build and deploy serverless applications on Amazon Web Services Lambda and API Gateway. Because AWS handles the horizontal scaling automatically, requests do not time out waiting for a free server. With Zappa, you can update your deployed code with a single command.
Arrow is a well-known, human-friendly Python library that offers a sensible approach to creating, formatting, manipulating, and converting dates, times, and timestamps. It supports Python 2 and 3, and it is an alternative to datetime that provides richer features with a nicer interface.
Theano is a Python deep learning library used to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays. It was developed by a machine learning research group. Theano is essentially a compiler for mathematical expressions: it integrates tightly with NumPy and provides speed and stability optimizations.
IPython is one of the most useful Python tools, and it provides a rich architecture to its users. It lets you write and execute Python code in your browser. IPython works on several operating systems, including Windows, Mac OS X, Linux, and most other Unix-like systems. It gives you all the features of the basic interpreter plus some extras, such as numbered prompts, more functions, a help function, and advanced editing.
TensorFlow is an open-source machine learning library for Python, created by the Google Brain team. It is used to design, build, and train deep learning models. It can also be used for general numerical computation, and it is an alternative to Theano. It runs on mobile devices, on single-CPU systems, and on GPUs.
In an ideal world, we would have perfectly balanced datasets and we would all train models and be happy.
Unfortunately, the real world is not like that, and certain tasks favor very imbalanced data.
For example, when predicting fraud in credit card transactions, you would expect that the vast majority of the transactions (+99.9%?) are actually legit.
Training ML algorithms naively will lead to dismal performance, so extra care is needed when working with these types of datasets.
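A tiny sketch shows why naive training and evaluation go wrong here. The numbers are invented toy data (1 fraudulent transaction in 1,000), and the "model" is a hypothetical baseline that simply labels everything as legitimate.

```python
# Toy data, invented for illustration: 1 = fraud, 0 = legitimate.
labels = [1] * 1 + [0] * 999

# A naive "model" that always predicts the majority class.
predictions = [0] * len(labels)

# Accuracy: share of all transactions labelled correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Recall on the fraud class: share of actual frauds that were caught.
frauds_caught = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = frauds_caught / sum(labels)

print(f"accuracy: {accuracy:.3f}")
print(f"fraud recall: {recall:.3f}")
```

The baseline looks excellent on accuracy while catching no fraud at all, which is why imbalanced problems need both different metrics and different training strategies.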
Fortunately, this is a studied research problem and a variety of techniques exist.
Imbalanced-learn is a Python package which offers implementations of some of those techniques, to make your life much easier.
It is compatible with scikit-learn and is part of scikit-learn-contrib projects.
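One of the simplest of those techniques is random oversampling: duplicating minority-class samples until the classes are the same size. The following is a minimal pure-Python sketch of the idea, not the library's implementation. The function name `random_oversample` and the toy data are assumptions made for illustration. Imbalanced-learn's own `RandomOverSampler` class covers this case and works directly with scikit-learn estimators.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority-class samples until all classes match
    the size of the largest class. Illustrative sketch only."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        # All original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == label]
        # Draw (with replacement) as many extra copies as are missing.
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_samples.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_samples, out_labels


# Toy dataset: four legitimate samples (0) and one fraudulent sample (1).
samples = [[0.1], [0.2], [0.3], [0.4], [9.9]]
labels = [0, 0, 0, 0, 1]

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, otherwise duplicated samples can leak into the evaluation data.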
The original Caffe framework has been widely used for years, and is known for its unparalleled performance and battle-tested codebase.
However, recent trends in DL made the framework stagnate in some directions.
Caffe2 is the attempt to bring Caffe to the modern world.
It supports distributed training, deployment (even in mobile platforms), the newest CPUs and CUDA-capable hardware.
While PyTorch may be better for research, Caffe2 is suitable for large-scale deployments, as seen at Facebook.
Dash, announced this year, is an open source library for building web applications, especially those that make good use of data visualization, in pure Python.
It is built on top of Flask, Plotly.js and React, and provides abstractions that free you from having to learn those frameworks and let you become productive quickly.
Fire is an open source library that can automatically generate a CLI for any Python project.
The key here is automatically: you almost don’t need to write any code or docstrings to build your CLI! To do the job, you only need to call a Fire method and pass it whatever you want turned into a CLI: a function, an object, a class, a dictionary, or even pass no arguments at all (which will turn your entire code into a CLI).
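To give a feel for the idea of deriving a CLI from existing code, here is a rough sketch that builds a command-line parser from a function's signature using only the standard library's `inspect` and `argparse`. This is not how Fire is implemented and it does not use Fire's API. The helper `run_as_cli` and the example function `greet` are hypothetical names chosen for illustration.

```python
import argparse
import inspect


def greet(name, greeting="Hello"):
    """Return a greeting for the given name."""
    return f"{greeting}, {name}!"


def run_as_cli(func, argv):
    """Build an argparse parser from func's signature, parse argv,
    and call func with the parsed values. Illustrative sketch only."""
    parser = argparse.ArgumentParser(description=func.__doc__)
    for param in inspect.signature(func).parameters.values():
        if param.default is inspect.Parameter.empty:
            # No default value: treat it as a required positional argument.
            parser.add_argument(param.name)
        else:
            # Has a default value: expose it as an optional --flag.
            parser.add_argument(f"--{param.name}", default=param.default)
    args = parser.parse_args(argv)
    return func(**vars(args))


# In a real script, argv would come from sys.argv[1:].
print(run_as_cli(greet, ["World"]))
print(run_as_cli(greet, ["World", "--greeting", "Hi"]))
```

The function itself contains no CLI-specific code. The parser is derived entirely from its signature and docstring, which is the convenience the paragraph above describes.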
For searching for and replacing keywords in text, FlashText is a better alternative to regular expressions.
In the author’s initial benchmark, it improved the runtime of the entire operation by a huge margin: from 5 days to 15 minutes.
The beauty of FlashText is that the runtime is the same no matter how many search terms you have, in contrast with regexp in which the runtime will increase almost linearly with the number of terms.
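The reason the number of search terms matters so little can be sketched with a trie (prefix tree): all keywords are stored in one nested dictionary and the text is scanned character by character, so the work per character does not grow with the number of keywords. The code below is a simplified illustration of that idea in pure Python, not FlashText's actual code or API. The function names and the `_end_` marker are assumptions made for this sketch.

```python
def build_trie(keywords):
    """Store every keyword character by character in a nested dict."""
    trie = {}
    for word in keywords:
        node = trie
        for ch in word.lower():
            node = node.setdefault(ch, {})
        node["_end_"] = word  # marks the end of a complete keyword
    return trie


def find_keywords(text, trie):
    """Scan the text once and return keywords that occur as whole words."""
    lowered = text.lower()
    n = len(lowered)
    found = []
    i = 0
    while i < n:
        # Only try to match at the start of a word.
        if i > 0 and lowered[i - 1].isalnum():
            i += 1
            continue
        node, j = trie, i
        match, match_end = None, i
        # Walk down the trie for as long as the characters keep matching.
        while j < n and lowered[j] in node:
            node = node[lowered[j]]
            j += 1
            # Accept only if the keyword ends at a word boundary.
            if "_end_" in node and (j == n or not lowered[j].isalnum()):
                match, match_end = node["_end_"], j
        if match is not None:
            found.append(match)
            i = match_end
        else:
            i += 1
    return found


trie = build_trie(["Python", "machine learning", "Java"])
result = find_keywords("I like python and Machine Learning, not JavaScript.", trie)
print(result)
```

Adding more keywords makes the trie larger but does not add passes over the text, whereas a regular expression with many alternatives typically has more work to do at each position.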
With Pipenv, you specify all your dependencies in a Pipfile — which is normally built by using commands for adding, removing, or updating dependencies.
The tool can generate a Pipfile.lock file, which makes your builds deterministic and helps you avoid difficult-to-catch bugs caused by some obscure dependency that you didn’t even think you needed.
Images are everywhere nowadays and understanding their content can be critical for several applications.
Thankfully, image processing techniques have advanced a lot, fueled by the advancements in DL.
Luminoth is an open source Python toolkit for computer vision, built using TensorFlow and Sonnet.
Currently, it supports object detection out of the box, in the form of a model called Faster R-CNN.
This post is curated by IssueHunt, an issue-based bounty platform for open-source projects.