19 Jul 2022

TDD: software design practice

Short backstory
While I was working at Launchyard, we were already 6-7 months into a project when the company hired Frank Wiles (a Django core contributor) as a consultant to review it. He pointed out that no tests had been written for the backend code. We literally stopped development for one and a half months and started writing tests for the code that was already written.
After this exercise, I realized that a lot of silly mistakes were being caught (and fixed) at the development stage instead of after deployment. That was a huge boost for me. Now I can write code and test it right away, wow, that's great!!

Building as a practice
I have been writing tests ever since. At one company I joined, nobody was writing any tests. I told the CTO that I would write tests for the project, and slowly the whole team adopted TDD after seeing the benefits.

Sometimes in a new job you have to start working on a project that is huge in size (LOC) or full of legacy code. Tests really help to speed things up there.

One time I had just joined a company, and on my second day a senior guy came to me and asked, "We have to make a minor change to this feature, can you check that?"
I said, "We do not have tests for this, so how do we know whether the change works as expected and does not break other things? I guess I have to understand this piece of code and write tests before I can make the change." The guy walked away looking unhappy!!


One more scenario where I have found TDD useful is fixing a bug the right way.
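A minimal sketch of what that can look like, assuming a hypothetical apply_discount() function (not from any project mentioned here) where a discount above 100% used to produce a negative price. The test that reproduces the bug is written first, and the fix counts as done only when that test passes:

```python
import unittest


def apply_discount(price, percent):
    """Hypothetical function under test: reduce price by percent."""
    # The fix: reject percentages that would produce a negative price.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (100 - percent) / 100, 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_discount_above_100_is_rejected(self):
        # Written first, to reproduce the (hypothetical) reported bug.
        with self.assertRaises(ValueError):
            apply_discount(50.0, 150)

    def test_regular_discount_still_works(self):
        # Guards against the fix breaking existing behaviour.
        self.assertEqual(apply_discount(200.0, 25), 150.0)


# Run the two tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The regression test stays in the suite afterwards, so the same bug cannot quietly come back.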

                                                   

Benefits of TDD:
1. Verify the behavior and correctness of the code.
2. Avoid introducing breaking changes.
3. Support refactoring.
4. Shipping code with confidence and velocity.

I have been part of a couple of early-stage startups with small teams, and we always did TDD as a practice, not as an optional approach. TDD is no longer a novelty; I see it as a necessary software design practice that a team needs to adopt, the sooner the better.

25 Oct 2021

Thoughts On Tech Debt

Technical debt is a burden on the codebase and on people as well. In this post I am going to share my experience on the topic. If the codebase or architecture of a system does not allow future requirements to be met at a high pace and with quality code, then there is tech debt in the system.

I want to look at this from two angles: one from the code and the other from the people.

Code: Tech Debt

1. There is tight coupling between different components.
2. Existing code works fine, but each new requirement is squeezed into it by adding one more if check or one more parameter.
3. Code duplication: changes need to be made in multiple places.
4. Commonly used code is not kept in one place.
5. Over-engineering comes at a cost.
6. Critical parts of the system are missing tests.

There could be any number of other points added to the above list. A developer with a good sense of software design and principles can avoid most of them. And yeah, always be willing to refactor, even when no one has asked for it.

 Person: Tech Debt

1. A developer should take pride in every line of code. If passion is lacking here, then debt will slowly creep into the system. I am not saying a conscientious person will never have tech debt, but given the opportunity he will always make an effort to reduce it.
A developer who is conscious about producing quality work certainly helps.

2. The team lead or manager should understand that tech debt is a roadblock to adding new features quickly. First they should accept it and ask the developers to make a plan to mitigate the debt; that's a good start. If they do not accept it, they will sooner or later, but the hard way.

3. The product owner should plan activities in a way that leaves some room to deal with tech debt.

4. There are situations where we have "to ship a feature, quickly", and that's fine. But one should come back later and have another look. Here the motivation of the developer comes in: if one is not motivated enough, he will just continue with the next task.


I have been at workplaces where developers are aware of tech debt and willing to do something about it, but management is not interested in the development team spending time on reducing it.
And at workplaces where management planned their sprints keeping tech debt in mind, or focused an entire sprint on handling it.

Intent is very important, whether from the developer or the product owner. If it is there, one can work towards reducing the tech debt. But in a good scenario both should have the intent to reduce tech debt as a team; then I guess good progress can be made.

Thanks for reading.

29 Jul 2021

Published Elixir Library

Recently, I published a library, moviematch, on Hex. I was working on this on and off. Sometimes motivation was up and sometimes down, but that's OK I guess. Here is a post I wrote about creating this library.

It is a small tool with a "scratch your own itch" kind of purpose. It is still very basic, and I will be iterating on it over time.

If you maintain an IMDb watchlist, try this out. If you have any ideas for features or find a bug, please do add issues on GitHub.




8 Aug 2019

positional-only-parameter in Python

Python 3.8 introduces a new feature: positional-only parameters.
I am going to cover the motivations behind positional-only parameters.
There are already some built-in functions which take positional arguments only, like range(), id() and len().

def add(x, y):
    return x+y


Motivations:
1. The programmer does not want to expose the parameter names to the caller.
  In add(x=3, y=4), x and y are directly exposed to the caller.

2. If the programmer later refactors the code and changes the parameter names, existing calls to the function will break.
If add(x, y) is changed to add(a, b), then add(x=3, y=4) will break. As a result, the programmer has to maintain the parameter names forever.

3. add(*args, **kwargs) is not clear to the caller, as *args can take any number of arguments without specifying the meaning of any of them.

4. When a subclass overrides a method of the base class and changes the parameter names.
class Sup(object):
  def add_one(self, x):
    return x+1

class Sub(Sup):
  def add_one(self, y):
    return y+1

Sub().add_one(x=1)

which results in an error:
TypeError: add_one() got an unexpected keyword argument 'x'

5. Maintaining the logical order of arguments.
def get_args(a, b, c):
   return (a, b, c)

get_args(a=1, c=3, b=2)
get_args(c=9, a=7, b=5)


We can clearly see that the declared order of the parameters is not followed in these calls. Positional-only parameters force callers to pass the arguments in the declared order.
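The motivations above can be sketched with the `/` marker that Python 3.8 adds to function signatures. This is only a small illustration reusing the add() and Sup/Sub examples from this post; everything before the `/` can be passed by position only:

```python
def add(x, y, /):
    """x and y can only be passed by position (Python 3.8+)."""
    return x + y


class Sup:
    def add_one(self, x, /):
        return x + 1


class Sub(Sup):
    # Renaming the parameter is safe: no caller can depend on its name.
    def add_one(self, y, /):
        return y + 1


total = add(3, 4)
incremented = Sub().add_one(1)

# Passing the same values by keyword is now rejected.
try:
    add(x=3, y=4)
except TypeError as error:
    keyword_call_error = str(error)
```

Since callers cannot write add(x=3, y=4) any more, renaming x and y later does not break anyone.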


Hope you like this article, Thanks for reading.

29 Mar 2019

Feature request for python's requests library

Recently, as part of my day job, I had to count the number of API requests being made to keep a check on throttling limits. As is standard for Python codebases, the requests library was being used.
I wrote some code which maintains the API URLs and their respective counts.
That got the thing done.

After a while I realized there should be a way to intercept the request, just like the middleware in Django. I checked the documentation of requests and found event hooks.

Going through the docs, I found that a hook is available only for the response, and that the maintainers intend to add other hooks as well, as mentioned in the TODO.
I saw this as an opportunity to contribute to one of the best Python libraries, and without any delay I created a GitHub issue.

Hopefully the library maintainers will have a look and officially accept it as a feature request.
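For illustration, here is a rough sketch of counting calls per URL with the existing response hook. The names request_counts and count_request are made up for this example, and the URLs are placeholders. So that it runs without network access, the hook is exercised with a stand-in response object; how it would be registered on a real session is shown in the comments:

```python
from collections import Counter
from types import SimpleNamespace

# Hypothetical module-level tally: URL -> number of responses seen.
request_counts = Counter()


def count_request(response, *args, **kwargs):
    """Response hook: count one call for the URL of this response."""
    request_counts[response.url] += 1
    return response


# With requests installed, the hook would be registered like this:
#   session = requests.Session()
#   session.hooks["response"].append(count_request)
#   session.get("https://api.example.com/users")

# Offline demonstration with stand-in response objects.
for url in ("https://api.example.com/users",
            "https://api.example.com/users",
            "https://api.example.com/orders"):
    count_request(SimpleNamespace(url=url))
```

Note that this counts responses, not outgoing requests, which is exactly the gap a request-side hook would close.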

8 Mar 2019

Learning Elixir with side project

Recently I have been attracted to functional programming: the idea of creating an application which is composed entirely of functions with no side effects, working with immutable data structures.
I just bumped into Elixir, tried it for a couple of days and yeah, really liked it.

How to learn
The idea is to identify problem(s) which I care about and solve those; it could be a very tiny thing.
This way I do not need to struggle with motivation to carry on.

Identifying problem

I maintain an IMDb watchlist and check torrent sites to see whether a movie from my watchlist is available or not. It's a mundane thing and I do not want to spend time just checking that.
So I chose to automate it and start my learning path for Elixir.

Building project
I created issues on the project and started working on those.
One part of my brain was asking, is this the correct way to do it? The other part was saying, just make it work.
More so coming from a Python background and having written code in it for more than six years: the first thing that comes to mind while writing code is to make it pythonic, since I already know how to make it work.
But with a new language I had to suppress the urge to write idiomatic code from day 1; I can improve it while refactoring.
It took a couple of weeks of small improvements and iterations. Whenever I was stuck on something, I took help from StackOverflow.

Learnings

1. Pattern matching is super cool. It really helps to write clean code.
2. Pipe operator. Another programming construct which helps to write clean code.
3. Mix, a build tool. Need to explore it more.
4. Tried out third-party libraries like HTTPoison, Floki and others.
5. Learning ideas from Joe Armstrong's thesis.

Feedback
Once a stable version of the project was built, I decided to get feedback from the Elixir community. I approached elixir-lang, elixir-forum, Gitter channels, lisbon-elixir, awesome-elixir and the local Elixir meetup group.
I really appreciate their time and feedback on the project.

Thanks for reading.

11 Jun 2018

JWT: how token is created

In this post I am going to share some thoughts on how a JWT token gets created. If you are not aware of JWT tokens and their use cases, then please read this and come back.

OK, now that you have a basic introduction to JWT tokens, I can go through the steps involved in creating one.

Let's say we are creating a token with the following:

algorithm is HS384
payload is {'some': 'payload'}
secret key is "secret"
and
header is {"hk": "head-val"}


So let's follow through the creation of a JWT token.

1. The algorithm is checked to see whether it is supported or not.

2. The payload is serialized to JSON and encoded in utf-8 into a byte string.
    b_payload = b'{"some":"payload"}'

3. The header is serialized to JSON and encoded in utf-8 into a byte string.
    b_header = b'{"typ":"JWT","alg":"HS384","hk":"head-val"}'

4. b_payload is base64url-encoded:
    en_payload = base64_encode(b_payload)
    looks like
    b'eyJzb21lIjoicGF5bG9hZCJ9'

5. b_header is base64url-encoded:
    en_header = base64_encode(b_header)
    looks like
    b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzM4NCIsImhrIjoiaGVhZC12YWwifQ'

6. A signing input is created by concatenating en_header and en_payload with a dot (.)
    signing_input = b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzM4NCIsImhrIjoiaGVhZC12YWwifQ.eyJzb21lIjoicGF5bG9hZCJ9'

7. The signature is created by signing signing_input with the key, using the chosen algorithm
    signature = alg_obj.sign(signing_input, key)     # alg_obj is the algorithm object for HS384
    looks like
    b'\xc1\x7f\x7f\xfb\x96\xb3\x0fc\x1e\x84.\x02\xe5\xf5\xfd\xbb\xb2\x9bf0\x9ea\xec\x06U\x15-]\xca;\x1f\xfb\xa6J\xc7pv\xdf\x0cu;j`o\xa6ia\x9d'

8. Now the signature is base64url-encoded
    en_signature = base64_encode(signature)
    looks like
    b'wX9_-5azD2MehC4C5fX9u7KbZjCeYewGVRUtXco7H_umSsdwdt8MdTtqYG-maWGd'

9. And finally all three components, en_header, en_payload and en_signature, are concatenated with a dot (.), which results in the token
    looks like
    b'eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzM4NCIsImhrIjoiaGVhZC12YWwifQ.eyJzb21lIjoicGF5bG9hZCJ9.wX9_-5azD2MehC4C5fX9u7KbZjCeYewGVRUtXco7H_umSsdwdt8MdTtqYG-maWGd'
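The steps above can be sketched with the standard library alone. This is a simplified illustration and not the code of any particular JWT library: create_token and base64url_encode are names chosen for this example, only the HMAC algorithms are handled, and the custom header used is {"hk": "head-val"}, which is what the first segment of the sample token decodes to.

```python
import base64
import hashlib
import hmac
import json

# Step 1: the algorithms this sketch supports.
SUPPORTED = {"HS256": hashlib.sha256, "HS384": hashlib.sha384, "HS512": hashlib.sha512}


def base64url_encode(data: bytes) -> bytes:
    """URL-safe base64 with the trailing '=' padding removed, as JWT uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=")


def create_token(payload, key, algorithm="HS384", headers=None):
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    header = {"typ": "JWT", "alg": algorithm}
    header.update(headers or {})
    # Steps 2 and 3: compact JSON, encoded as utf-8 bytes.
    b_payload = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    b_header = json.dumps(header, separators=(",", ":")).encode("utf-8")
    # Steps 4 to 6: encode both parts and join header.payload.
    signing_input = base64url_encode(b_header) + b"." + base64url_encode(b_payload)
    # Step 7: HMAC over the signing input with the secret key.
    signature = hmac.new(key.encode("utf-8"), signing_input, SUPPORTED[algorithm]).digest()
    # Steps 8 and 9: encode the signature and append it.
    return signing_input + b"." + base64url_encode(signature)


token = create_token({"some": "payload"}, "secret", headers={"hk": "head-val"})
en_header, en_payload, en_signature = token.split(b".")
```

Anyone holding the same key can recompute the signature over the first two segments and compare it, which is how verification works for the HS family.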


These are the very basics of how a token is created, hope you find it useful.

7 Jun 2018

On the fly: namedtuple

Python has namedtuple in the collections module. It is interesting to look at how it is created, because the class is built on the fly for a user-defined name.


namedtuple(typename, field_names, verbose=False, rename=False)
for example:
Point = namedtuple('Point', ['x', 'y'])


I would like to share a few observations about namedtuple after going through its source code.

1. A class definition is stored as a string template which is to be executed.

2. A subclass of tuple is created with typename as its name, i.e. Point
     class Point(tuple):
        pass

3. property and operator.itemgetter are applied to each of the field_names, so
x and y become read-only attributes of Point.

4. A class definition is created which involves 2. and 3.

5. exec() executes the class definition, and the resulting Point class is returned.
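The observations above can be sketched in a much simplified form. This is not the standard library's code: simple_namedtuple and CLASS_TEMPLATE are names made up for this example, and validation, renaming, _replace and the other extras are left out.

```python
from operator import itemgetter

# Observations 1, 2 and 4: the class definition lives in a string template.
CLASS_TEMPLATE = """\
class {typename}(tuple):
    __slots__ = ()
    _fields = {field_names!r}

    def __new__(cls, {arg_list}):
        return tuple.__new__(cls, ({arg_list},))
"""


def simple_namedtuple(typename, field_names):
    source = CLASS_TEMPLATE.format(
        typename=typename,
        field_names=tuple(field_names),
        arg_list=", ".join(field_names),
    )
    namespace = {}
    exec(source, namespace)  # observation 5: run the generated definition
    cls = namespace[typename]
    # Observation 3: each field becomes a read-only property by index.
    for index, name in enumerate(field_names):
        setattr(cls, name, property(itemgetter(index)))
    return cls


Point = simple_namedtuple("Point", ["x", "y"])
p = Point(1, 2)
```

Here p.x reads index 0 of the underlying tuple, and p still compares equal to (1, 2) because Point is just a tuple subclass.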


You can have a look at this code, thanks for reading.

13 Nov 2017

Django Internal: create()

A model instance corresponds to a record in the database once saved. In this post I am going to
explore what happens behind the scenes when you call create(). It creates an instance of the model and saves that instance, which creates a record in the database.

Let's say we have a model defined as:

class Author(models.Model):
   name = models.CharField(max_length=100)
   age = models.IntegerField()


Let's go step by step to deconstruct Author.objects.create()

Creating instance of model

 instance = Author(name="roy", age=47)

1. Signals are sent
When Model.__init__() starts it sends the pre_init signal, and on successful completion it sends post_init.

2. The given keyword arguments are checked against the model fields (names, not values).
The check fails if you pass more arguments than the number of fields, or an attribute which is not present on the model.
for example:
instance = Author(name="roy", age=47, code='python')
TypeError: 'code' is an invalid keyword argument for this function.
Each model has an attribute called ._meta which gives these details about the model.



For information, models.Model has a metaclass called ModelBase, which adds attributes to the model class itself, like ._meta and the exceptions DoesNotExist and MultipleObjectsReturned, and validates the model's definition.


Saving model instance
The model method save_base() is called by save() to save the model instance. This is where the pre_save and post_save signals are sent.

There are a couple of important actors involved here, and I would like to introduce them briefly.
1. InsertQuery: an instance of this class represents the actual SQL INSERT query; it is given the model's field objects and values.
2. SQLInsertCompiler: this is the SQL compiler for InsertQuery. It is the place where Django model fields and values are turned into a raw SQL statement.
3. DatabaseWrapper: it is a wrapper around the database client (mysql-python here). It creates the connection to the SQL server and uses a cursor to execute the raw SQL statement.

The QuerySet private method _insert() is the place where things kick off at the low level. Creating a record means an INSERT SQL query, and this happens inside an atomic transaction.
Basically Django creates the insert query and gives it to the compiler, and the compiler invokes the database wrapper's cursor execute() method, passing the raw SQL statement as

     cursor.execute('INSERT INTO `author` (`name`, `age`) VALUES (%s, %s)', ['roy', 47])

Once the above code has executed successfully, the record is created in the database. Thereafter the post_save signal is sent. And that's how a model instance creates a record in the database.
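To make the last step concrete without a Django project, here is a toy stand-in built on sqlite3 from the standard library. It is not Django's code: build_insert is a hypothetical helper that plays the role of the SQL compiler (field names and values in, parameterised statement out), and the connection plays the role of the database wrapper.

```python
import sqlite3


def build_insert(table, values):
    """Return (sql, params) for a parameterised INSERT statement."""
    columns = list(values)
    column_list = ", ".join(f'"{column}"' for column in columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f'INSERT INTO "{table}" ({column_list}) VALUES ({placeholders})'
    return sql, [values[column] for column in columns]


connection = sqlite3.connect(":memory:")
connection.execute(
    'CREATE TABLE "author" (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)'
)

sql, params = build_insert("author", {"name": "roy", "age": 47})

# The with block commits on success and rolls back on an exception,
# a small-scale version of running the INSERT inside a transaction.
with connection:
    cursor = connection.execute(sql, params)

row = connection.execute(
    'SELECT name, age FROM "author" WHERE id = ?', (cursor.lastrowid,)
).fetchone()
```

The values travel separately from the SQL text, the same way the `%s` placeholders and the parameter list are passed to cursor.execute() above.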

Hope you found it interesting, keep reading.

14 Oct 2017

Django Manager's methods

In this post I am going to explore how the django.db.models.manager.Manager class supports queryset methods.

>> Book.objects.all()           # returns a queryset.
>> Book.objects.filter(author='george-orwell')        # returns all books by author george-orwell.
>> Book.objects.count()    # returns the number of books.

Here Book.objects is an instance of the Manager class django.db.models.manager.Manager.
OK, so all these methods should be defined on this class, right? No, these methods are not defined on the manager class but on the QuerySet class.

The Manager class inherits from django.db.models.manager.BaseManagerFromQuerySet.
And the magic is that no such class as BaseManagerFromQuerySet is written anywhere in the source.
That class is dynamically generated by the from_queryset method of the BaseManager class.
>> BaseManager.from_queryset(QuerySet)

This is where all the magic happens.
It takes the public methods of QuerySet and dynamically attaches them to a newly created class, BaseManagerFromQuerySet. This class is then used as the base class for Manager. Methods marked queryset_only, such as delete(), are skipped, which is why deleting everything goes through Book.objects.all().delete().

And that's how filter(), count() etc. can be invoked on the manager.
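The mechanism can be sketched in plain Python. The QuerySet below is a toy over a list of dicts, not Django's class, and the whole block is only an approximation of the idea: copy the public methods of one class onto a class created at runtime with type().

```python
import inspect


class QuerySet:
    """Toy queryset over a list of dicts (stand-in for Django's QuerySet)."""

    def __init__(self, items=()):
        self.items = list(items)

    def all(self):
        return QuerySet(self.items)

    def filter(self, **conditions):
        return QuerySet(
            item for item in self.items
            if all(item.get(key) == value for key, value in conditions.items())
        )

    def count(self):
        return len(self.items)

    def delete(self):
        self.items.clear()
    delete.queryset_only = True  # never copied to the manager


class BaseManager:
    def __init__(self, items=()):
        self._items = list(items)

    def get_queryset(self):
        return QuerySet(self._items)

    @classmethod
    def _get_queryset_methods(cls, queryset_class):
        def create_method(name):
            # Each proxy forwards the call to a fresh queryset.
            def manager_method(self, *args, **kwargs):
                return getattr(self.get_queryset(), name)(*args, **kwargs)
            manager_method.__name__ = name
            return manager_method

        methods = {}
        for name, method in inspect.getmembers(queryset_class, inspect.isfunction):
            if name.startswith("_") or hasattr(cls, name):
                continue
            if getattr(method, "queryset_only", False):
                continue
            methods[name] = create_method(name)
        return methods

    @classmethod
    def from_queryset(cls, queryset_class):
        class_name = f"{cls.__name__}From{queryset_class.__name__}"
        return type(class_name, (cls,), cls._get_queryset_methods(queryset_class))


class Manager(BaseManager.from_queryset(QuerySet)):
    pass


books = [
    {"title": "1984", "author": "george-orwell"},
    {"title": "Emma", "author": "jane-austen"},
]
manager = Manager(books)
orwell_titles = [book["title"] for book in manager.filter(author="george-orwell").items]
generated_base_name = Manager.__mro__[1].__name__
```

The generated base class only exists at runtime, which is why searching the source for its name finds nothing.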



Hope you guys find it useful.

Reference:
manager.py

10 Oct 2017

python: new-style and old-style class

A new-style class inherits from object (or type), whereas an old-style class does not. Old-style classes were removed in Python 3.0.

In this post I am going to explore the differences between the two.

1: types

For an old-style class (Python 2), type(obj) is not the class itself but the built-in type instance. The attribute obj.__class__ tells you which class the object belongs to.
So type(obj) and obj.__class__ differ for an old-style class.

With a new-style class (Python 3.6), type(obj) and obj.__class__ do not differ.
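A minimal check of this on Python 3, where every class is new-style. The class name NewABC follows the naming used later in this post:

```python
class NewABC:
    pass


obj = NewABC()

# For a new-style class both ways of asking give the very same object.
same_type = type(obj) is obj.__class__
type_of_class = type(NewABC)
```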


2: MRO (Method Resolution Order)

In addition to the above difference, an old-style class does not expose its MRO (it has no mro() method or __mro__ attribute), whereas a new-style class does.

One can also see from the MRO that a new-style class inherits from object.
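A small Python 3 illustration with a diamond-shaped hierarchy; the class names are made up for this example:

```python
class Base:
    pass


class Left(Base):
    pass


class Right(Base):
    pass


class Child(Left, Right):
    pass


# mro() lists the classes in the order attribute lookup visits them.
mro_names = [cls.__name__ for cls in Child.mro()]
```

The list ends with object, even though no class here names it explicitly.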


3: super
super() does not work with old-style classes; it works with new-style classes.

The reason is that super() expects a type object for the class. For an old-style class, type(ABC) is classobj, not type; for a new-style class, type(NewABC) is type.
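On Python 3 the new-style case looks like this. The greet() method is just an illustration:

```python
class NewABC:
    def greet(self):
        return "hello from NewABC"


class Child(NewABC):
    def greet(self):
        # super() finds the next class in the MRO, here NewABC.
        return super().greet() + " via Child"


message = Child().greet()
```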

4: descriptor protocol
New-style classes implement the descriptor protocol, whereas old-style classes do not.
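As a reminder of what the protocol gives you, here is a small descriptor on Python 3. Positive and Account are hypothetical names for this example; the descriptor validates every assignment to the attribute it manages:

```python
class Positive:
    """Data descriptor that only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError("value must be positive")
        setattr(instance, self.private_name, value)


class Account:
    balance = Positive()

    def __init__(self, balance):
        self.balance = balance  # goes through Positive.__set__


account = Account(100)

try:
    account.balance = -5
except ValueError as error:
    rejected = str(error)
```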


5: raising an exception
An instance of a new-style class cannot be raised as an exception unless its class derives from BaseException (in practice, from Exception).
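On Python 3 this is easy to see; NotAnError and RealError are names made up for this example:

```python
class NotAnError:
    pass


class RealError(Exception):
    pass


# Raising an instance of a plain class fails with TypeError.
try:
    raise NotAnError()
except TypeError as error:
    plain_class_message = str(error)

# Raising an instance of an Exception subclass works as usual.
try:
    raise RealError("boom")
except RealError as error:
    real_error_message = str(error)
```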

Thanks for reading.

2 Oct 2017

repr and str in python

The purpose of repr() and str() is to display information about an object. But there is a slight distinction between the two. In this post we will explore that.

By definition, repr() gives the official representation of an object. That means that if I want to re-create the object, its repr should be descriptive enough to give me that information.
Let's take the example of a datetime object.

>> import datetime
>> now_datetime = datetime.datetime.now()
>> repr(now_datetime)

prints

>> 'datetime.datetime(2017, 10, 2, 11, 0, 7, 382330)'

So if I want to create a new datetime object equal to this one, the repr tells me exactly how:

new_datetime = datetime.datetime(2017, 10, 2, 11, 0, 7, 382330)

That's exactly the purpose of repr.

As per the docs, str() is the 'informal' representation of an object. It has to be a string, and you can put in whatever you feel is readable.

>> str(now_datetime)
>> '2017-10-02 11:00:07.382330'


Now the difference
Both must return a string. str() can return any readable string, whereas repr() should, where possible, return a string that is a valid Python expression
which can be passed to eval() to re-create the object.


>> make_this_repr = "x+1"
>> repr(make_this_repr)
>> "'x+1'"
>> eval(repr(make_this_repr))     # gives back the original string
>> 'x+1'
>> x = 10
>> eval(make_this_repr)           # evaluates the expression held in the string
>> 11
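The same distinction for a class of your own, as a small sketch. Point is a hypothetical class for this example, with __repr__ written so that eval() can rebuild the object and __str__ written for people:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Official form: a valid expression that re-creates the object.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # Informal form: whatever reads well.
        return f"({self.x}, {self.y})"

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


point = Point(1, 2)
official = repr(point)
informal = str(point)
rebuilt = eval(official)
```

If a class defines only __repr__, str() falls back to it, so __repr__ is the one worth writing first.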


 
Hope you find it useful, thanks for reading.

13 Apr 2017

Full-Text Search in MySQL

In this post I am going to explore the basics of this feature. The following is the table I am going to use for the full-text search explanation.

SELECT * FROM articles;
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
|  2 | How To Use MySQL Well | After you went through a ...             |
|  3 | Optimizing MySQL      | In this tutorial we will show ...        |
|  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ...      |
|  5 | MySQL vs. YourSQL     | In the following database comparison ... |
|  6 | MySQL Security        | When configured properly, MySQL ...      |
|  7 | Database Theory       | I am going to teach you database theory  |
+----+-----------------------+------------------------------------------+

While creating the table I used FULLTEXT (title,body) to allow full-text search on these columns. Behind the scenes it creates an index on these columns.

Let's do a full-text search for the string 'database'.

SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE);
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  7 | Database Theory       | I am going to teach you database theory  |
|  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
|  5 | MySQL vs. YourSQL     | In the following database comparison ... |
+----+-----------------------+------------------------------------------+

Full-text search allows different modes to run a search query; here it is IN NATURAL LANGUAGE MODE.

  • It means the search string is treated as a phrase in natural human language.

  • By default full-text search is case-insensitive.

  • The minimum length of a word to be indexed is 3 for InnoDB tables, which can be changed with innodb_ft_min_token_size.
  • The returned rows are sorted with the highest relevance first. Relevance is a score calculated internally by MySQL.

The above query searches for the string in title and body. Now let's search in title only.
SELECT * FROM articles WHERE MATCH (title) AGAINST ('database' IN NATURAL LANGUAGE MODE);
Since we have defined the full-text index on title and body together, to search only in the title column we need to define a separate FULLTEXT index on that column.

Let's now do a full-text search for the string 'to'.

SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('to' IN NATURAL LANGUAGE MODE);

No result, but the string 'to' exists in the body of the row with id=7?
This is because full-text search skips certain words, which are referred to as stop words.
You can find the list of stop words like this:
SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
+-------+
| value |
+-------+
| a     |
| about |
| an    |
| are   |
| as    |
| at    |
| be    |
| by    |
| com   |
| de    |
| en    |
| for   |
| from  |
| how   |
| i     |
| in    |
| is    |
| it    |
| la    |
| of    |
| on    |
| or    |
| that  |
| the   |
| this  |
| to    |
| was   |
| what  |
| when  |
| where |
| who   |
| will  |
| with  |
| und   |
| the   |
| www   |
+-------+

What if you want a more specific search, like getting all results that contain the string 'database' but not 'MySQL'? That is where the BOOLEAN mode of full-text search helps.

SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('+database -MySQL' IN BOOLEAN MODE);
+----+-----------------------+------------------------------------------+
| id | title                 | body                                     |
+----+-----------------------+------------------------------------------+
|  7 | Database Theory       | I am going to teach you database theory  |
+----+-----------------------+------------------------------------------+

more boolean mode examples:

  • find all rows which have at least one of 'MySQL' and 'YourSQL'
    SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('MySQL YourSQL' IN BOOLEAN MODE);
    +----+-----------------------+------------------------------------------+
    | id | title                 | body                                     |
    +----+-----------------------+------------------------------------------+
    |  5 | MySQL vs. YourSQL     | In the following database comparison ... |
    |  6 | MySQL Security        | When configured properly, MySQL ...      |
    |  1 | MySQL Tutorial        | DBMS stands for DataBase ...             |
    |  2 | How To Use MySQL Well | After you went through a ...             |
    |  3 | Optimizing MySQL      | In this tutorial we will show ...        |
    |  4 | 1001 MySQL Tricks     | 1. Never run mysqld as root. 2. ...      |
    +----+-----------------------+------------------------------------------+

  • find rows that contain both words
    SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('+MySQL +YourSQL' IN BOOLEAN MODE);
    +----+-----------------------+------------------------------------------+
    | id | title                 | body                                     |
    +----+-----------------------+------------------------------------------+
    |  5 | MySQL vs. YourSQL     | In the following database comparison ... |
    +----+-----------------------+------------------------------------------+

Similar kinds of searches can be made using Boolean mode.
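To make the meaning of the + and - operators concrete, here is a toy model of them in Python over the same seven rows. It is not how MySQL implements full-text search (there is no index, no relevance score and no stop-word handling here); boolean_match and search are hypothetical helpers that only mimic which rows match.

```python
import re

ARTICLES = {
    1: ("MySQL Tutorial", "DBMS stands for DataBase ..."),
    2: ("How To Use MySQL Well", "After you went through a ..."),
    3: ("Optimizing MySQL", "In this tutorial we will show ..."),
    4: ("1001 MySQL Tricks", "1. Never run mysqld as root. 2. ..."),
    5: ("MySQL vs. YourSQL", "In the following database comparison ..."),
    6: ("MySQL Security", "When configured properly, MySQL ..."),
    7: ("Database Theory", "I am going to teach you database theory"),
}


def boolean_match(query, text):
    """Toy version of boolean mode: +word required, -word excluded."""
    words = set(re.findall(r"[a-z0-9_]+", text.lower()))
    required, excluded, optional = [], [], []
    for term in query.lower().split():
        if term.startswith("+"):
            required.append(term[1:])
        elif term.startswith("-"):
            excluded.append(term[1:])
        else:
            optional.append(term)
    if any(word in words for word in excluded):
        return False
    if not all(word in words for word in required):
        return False
    # With no required words, at least one optional word must be present.
    return bool(required) or any(word in words for word in optional)


def search(query):
    return sorted(
        article_id
        for article_id, (title, body) in ARTICLES.items()
        if boolean_match(query, title + " " + body)
    )


database_not_mysql = search("+database -MySQL")
either_word = search("MySQL YourSQL")
both_words = search("+MySQL +YourSQL")
```

The three searches return the same rows as the three boolean mode queries in this post, just sorted by id instead of by relevance.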

So these were the very basics of full-text search, hope you find it helpful.