Strip spaces/tabs/newlines - python

Question 1

I am trying to remove all spaces/tabs/newlines in Python 2.7 on Linux.

I wrote this, which I expected to do the job:

myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = myString.strip(' \n\t')
print myString

Output:

I want to Remove all white   spaces, new lines 
 and tabs

It seems like a simple thing to do, yet I am missing something here. Should I be importing something?

Question 2

Use str.split([sep[, maxsplit]]) with no sep or with sep=None:

From the docs:

If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace.

Demo:

>>> myString.split()
['I', 'want', 'to', 'Remove', 'all', 'white', 'spaces,', 'new', 'lines', 'and', 'tabs']

Use str.join on the returned list to get this output:

>>> ' '.join(myString.split())
'I want to Remove all white spaces, new lines and tabs'
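
To see why the strip() call in the question appears to do nothing, it may help to compare the two approaches side by side. This is a minimal sketch, written so that it also runs on Python 3; the variable names stripped and collapsed are only illustrative:

```python
my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# strip() only trims the given characters from both ends of the string;
# whitespace in the middle is left alone.
stripped = my_string.strip(" \n\t")

# split() with no arguments breaks on any run of whitespace,
# and join() glues the words back together with single spaces.
collapsed = " ".join(my_string.split())

print(repr(stripped))
print(repr(collapsed))
```

Using "".join(...) instead of " ".join(...) would drop the whitespace entirely rather than collapsing it.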

Question 3

If you want to remove runs of whitespace and replace each one with a single space, the easiest way is with a regular expression like this:

>>> import re
>>> myString="I want to Remove all white \t spaces, new lines \n and tabs \t"
>>> re.sub(r'\s+', ' ', myString)
'I want to Remove all white spaces, new lines and tabs '

You can then remove the trailing space with .strip() if you want to.
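
Putting the two steps together might look like the sketch below. The helper name collapse_whitespace is hypothetical, not something provided by the re module:

```python
import re

def collapse_whitespace(text):
    """Replace every run of whitespace with one space, then trim both ends."""
    return re.sub(r"\s+", " ", text).strip()

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print(repr(collapse_whitespace(my_string)))
```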

Question 4

Use the re library

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
myString = re.sub(r"[\n\t\s]*", "", myString)
print myString

Output:

IwanttoRemoveallwhitespaces,newlinesandtabs

Question 5

import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"
print re.sub(r"\W", "", mystr)

Output : IwanttoRemoveallwhitespacesnewlinesandtabs
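
Note that \W matches every non-word character, so punctuation such as the comma disappears as well. If only whitespace should go, \s is the narrower choice. A small comparison, offered as a sketch (the result variable names are illustrative):

```python
import re

mystr = "I want to Remove all white \t spaces, new lines \n and tabs \t"

# \W = anything that is not a letter, digit or underscore (comma included).
no_non_word = re.sub(r"\W", "", mystr)

# \s = whitespace only, so the comma after "spaces" survives.
no_whitespace = re.sub(r"\s", "", mystr)

print(no_non_word)
print(no_whitespace)
```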

Question 6

This will remove only tabs, newlines and spaces, and nothing else.

import re
myString = "I want to Remove all white \t spaces, new lines \n and tabs \t"
output   = re.sub(r"[\n\t\s]*", "", myString)

OUTPUT:

IwanttoRemoveallwhitespaces,newlinesandtabs

Good day!

Question 7

The solutions above that suggest regex aren't ideal: this is a small task, and regex carries more overhead than the simplicity of the task justifies.

Here's what I do:

myString = myString.replace(' ', '').replace('\t', '').replace('\n', '')

or, if you had so many things to remove that a one-line solution would be gratuitously long:

removal_list = [' ', '\t', '\n']
for s in removal_list:
  myString = myString.replace(s, '')
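
The loop can be wrapped in a small function and checked against the regex-free split()/join() approach. This is only a sketch; remove_chars is a hypothetical helper name:

```python
def remove_chars(text, removal_list=(" ", "\t", "\n")):
    """Delete every occurrence of each string in removal_list from text."""
    for s in removal_list:
        text = text.replace(s, "")
    return text

my_string = "I want to Remove all white \t spaces, new lines \n and tabs \t"

via_replace = remove_chars(my_string)
# split() with no arguments drops all whitespace, including characters
# such as "\r" that are not in removal_list above.
via_split = "".join(my_string.split())

print(via_replace)
print(via_split)
```

For this input both approaches give the same string; they would differ only if the text contained whitespace characters missing from removal_list.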

Question 8

Since none of the other answers covered a more intricate case, I wanted to share this because it helped me out.

This is what I originally used:

import requests
import re

url = '/programming/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
print("{}".format(r.content))

Undesired result:

b'<!DOCTYPE html>\r\n\r\n\r\n    <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive">\r\n\r\n    <head>\r\n\r\n        <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>\r\n        <link

This is what I changed it to:

import requests
import re

url = '/programming/10711116/strip-spaces-tabs-newlines-python' # noqa
headers = {'user-agent': 'my-app/0.0.1'}
r = requests.get(url, headers=headers)
regex = r'\s+'
print("CNT: {}".format(re.sub(regex, " ", r.content.decode('utf-8'))))

Desired result:

<!DOCTYPE html> <html itemscope itemtype="http://schema.org/QAPage" class="html__responsive"> <head> <title>string - Strip spaces/tabs/newlines - python - Stack Overflow</title>

The exact regex that @MattH mentioned was what worked for me once I fitted it into my code. Thanks!

Note: this is Python 3.
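
The same decode-then-substitute step can be tried without a network request. In this sketch the bytes literal is a made-up stand-in for r.content, and normalize_html_whitespace is a hypothetical helper name:

```python
import re

def normalize_html_whitespace(content, encoding="utf-8"):
    """Decode bytes and collapse each run of whitespace into one space."""
    return re.sub(r"\s+", " ", content.decode(encoding))

# Made-up sample standing in for r.content from requests.
sample = b"<!DOCTYPE html>\r\n\r\n\r\n    <html>\r\n\r\n    <head>\r\n"
print(normalize_html_whitespace(sample))
```

Decoding first matters: printing the bytes object directly shows the escaped \r\n sequences instead of the text.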